Preface


Linear Algebra: A First Course presents an introduction to the fascinating subject of 
linear algebra. As the title suggests, this text is designed as a first course in linear algebra 
for students who have a reasonable understanding of basic algebra. Major topics of linear 
algebra are presented in detail, with proofs of important theorems provided. Connections 
to additional topics covered in advanced courses are introduced, in an effort to assist those 
students who are interested in continuing on in linear algebra. 

Each chapter begins with a list of desired outcomes which a student should be able to 
achieve upon completing the chapter. Throughout the text, examples and diagrams are given 
to reinforce ideas and provide guidance on how to approach various problems. Suggested 
exercises are given at the end of each section, and students are encouraged to work through 
a selection of these exercises. 

A brief review of complex numbers is given, which can serve as an introduction to anyone 
unfamiliar with the topic. 

Linear algebra is a wonderful and interesting subject, which should not be limited to a 
challenge of correct arithmetic. The use of a computer algebra system can be a great help 
in long and difficult computations. Some of the standard computations of linear algebra are 
easily done by the computer, including finding the reduced row-echelon form. While the use 
of a computer system is encouraged, it is not meant to be done without the student having 
an understanding of the computations. 
1. Systems of Equations


1.1 Systems of Equations, Geometry 


Outcomes 


A. Relate the types of solution sets of a system of two (three) variables to the 
intersections of lines in a plane (the intersection of planes in three space) 


As you may remember, linear equations like 2x + 3y = 6 can be graphed as straight lines 
in the coordinate plane. We say that this equation is in two variables, in this case x and 
y. Suppose you have two such equations, each of which can be graphed as a straight line, 
and consider the resulting graph of two lines. What would it mean if there exists a point 
of intersection between the two lines? This point, which lies on both graphs, gives x and y 
values for which both equations are true. In other words, this point gives the ordered pair
(x, y) that satisfies both equations. If the point (x, y) is a point of intersection, we say that
(x, y) is a solution to the two equations. In linear algebra, we often are concerned with
finding the solution(s) to a system of equations, if such solutions exist. First, we consider 
graphical representations of solutions and later we will consider the algebraic methods for 
finding solutions. 

When looking for the intersection of two lines in a graph, several situations may arise. 
The following picture demonstrates the possible situations when considering two equations 
(two lines in the graph) involving two variables. 





[Figure: three graphs of two lines, labeled One Solution, No Solutions, and Infinitely Many Solutions]


In the first diagram, there is a unique point of intersection, which means that there is only 
one (unique) solution to the two equations. In the second, there are no points of intersection 
and no solution. When no solution exists, this means that the two lines are parallel and they 
never intersect. The third situation which can occur, as demonstrated in diagram three, is 
that the two lines are really the same line. For example, x + y = 1 and 2x + 2y = 2 are 
equations which when graphed yield the same line. In this case there are infinitely many 
points which are solutions of these two equations, as every ordered pair which is on the
graph of the line satisfies both equations. When considering linear systems of equations,
there are always three types of solutions possible: exactly one (unique) solution, infinitely
many solutions, or no solution.
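The three cases for two lines can also be checked numerically. The following is a minimal sketch in Python, which is not a tool the text itself prescribes; the function name `classify_two_lines` and the way each line is passed as three numbers are assumptions made for illustration. It assumes each equation really describes a line, that is, the coefficients of x and y in an equation are not both zero.

```python
from fractions import Fraction


def classify_two_lines(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Hypothetical helper: assumes (a1, b1) and (a2, b2) are each not (0, 0),
    so that both equations graph as lines.
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines have different directions and cross at exactly one point.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return "one solution", (x, y)
    # det == 0: the lines are parallel or identical. They are identical
    # exactly when the constants are in the same proportion as the coefficients.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions", None
    return "no solution", None


# x + y = 1 and 2x + 2y = 2 graph the same line.
print(classify_two_lines(1, 1, 1, 2, 2, 2))  # ('infinitely many solutions', None)
# x + y = 1 and x + y = 2 are parallel and distinct.
print(classify_two_lines(1, 1, 1, 1, 1, 2))  # ('no solution', None)
```

Exact fractions are used so that the intersection point is not affected by rounding.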



Example 1.1

Use a graph to find the solution to the following system of equations.

x + y = 3
y - x = 5

Solution. Through graphing the above equations and identifying the point of intersection, we
can find the solution(s). Remember that we must have either one solution, infinitely many, or
no solutions at all. The following graph shows the two equations, as well as the intersection.
Remember, the point of intersection represents the solution of the two equations, or the
(x, y) which satisfy both equations. In this case, there is one point of intersection at (-1, 4),
which means we have one unique solution, x = -1, y = 4.



□ 

In the above example, we investigated the intersection point of two equations in two 
variables, x and y. Now we will consider the graphical solutions of three equations in two 
variables. 

Consider a system of three equations in two variables. Again, these equations can be 
graphed as straight lines in the plane, so that the resulting graph contains three straight 
lines. Recall the three possible types of solutions: no solution, one solution, and infinitely
many solutions. There are now more complex ways of achieving these situations, due to the 
presence of the third line. For example, you can imagine the case of three intersecting lines 
having no common point of intersection. Perhaps you can also imagine three intersecting 
lines which do intersect at a single point. These two situations are illustrated below. 
[Figure: two graphs of three lines, labeled No Solution and One Solution]


Consider the first picture above. While all three lines intersect with one another, there 
is no common point of intersection where all three lines meet at one point. Hence, there is 
no solution to the three equations. Remember, a solution is a point (x, y) which satisfies all
three equations. In the case of the second picture, the lines intersect at a common point. 
This means that there is one solution to the three equations whose graphs are the given lines. 
You should take a moment now to draw the graph of a system which results in three parallel 
lines. Next, try the graph of three identical lines. Which type of solution is represented in 
each of these graphs? 

We have now considered the graphical solutions of systems of two equations in two 
variables, as well as three equations in two variables. However, there is no reason to limit 
our investigation to equations in two variables. We will now consider equations in three 
variables. 

You may recall that equations in three variables, such as 2x + 4y - 5z = 8, form a
plane. Above, we were looking for intersections of lines in order to identify any possible
solutions. When graphically solving systems of equations in three variables, we look for
intersections of planes. These points of intersection give the (x, y, z) that satisfy all the
equations in the system. What types of solutions are possible when working with three 
variables? Consider the following picture involving two planes, which are given by two 
equations in three variables. 



Notice how these two planes intersect in a line. This means that the points (x, y, z) on
this line satisfy both equations in the system. Since the line contains infinitely many points, 
this system has infinitely many solutions. 

It could also happen that the two planes fail to intersect. However, is it possible to have 
two planes intersect at a single point? Take a moment to attempt drawing this situation, and 
convince yourself that it is not possible! This means that when we have only two equations 
in three variables, there is no way to have a unique solution! Hence, the types of solutions 
possible for two equations in three variables are no solution or infinitely many solutions. 
Now imagine adding a third plane. In other words, consider three equations in three
variables. What types of solutions are now possible? Consider the following diagram. 



In this diagram, there is no point which lies in all three planes. There is no intersection 
between all planes so there is no solution. The picture illustrates the situation in which the 
line of intersection of the new plane with one of the original planes forms a line parallel to 
the line of intersection of the first two planes. However, in three dimensions, it is possible 
for two lines to fail to intersect even though they are not parallel. Such lines are called skew 
lines. 

Recall that when working with two equations in three variables, it was not possible to 
have a unique solution. Is it possible when considering three equations in three variables? 
In fact, it is possible, and we demonstrate this situation in the following picture. 



In this case, the three planes have a single point of intersection. Can you think of other 
types of solutions possible? Another is that the three planes could intersect in a line, resulting 
in infinitely many solutions, as in the following diagram. 



We have now seen how three equations in three variables can have no solution, a unique 
solution, or intersect in a line resulting in infinitely many solutions. It is also possible that 
the three equations graph the same plane, which also leads to infinitely many solutions. 
You can see that when working with equations in three variables, there are many more
ways to achieve the different types of solutions than when working with two variables. It 
may prove enlightening to spend time imagining (and drawing) many possible scenarios, and 
you should take some time to try a few. 

You should also take some time to imagine (and draw) graphs of systems in more than
three variables. The graphs of equations like x + y - 2z + 4w = 8 with more than three
variables are often called hyper-planes. You may soon realize that it is tricky to draw the
graphs of hyper-planes! Through the tools of linear algebra, we can algebraically examine
these types of systems which are difficult to graph. In the following section, we will consider
these algebraic tools.


1.1.1. Exercises 

1. Graphically, find the point $(x_1, y_1)$ which lies on both lines, x + 3y = 1 and 4x - y = 3.
That is, graph each line and see where they intersect.

2. Graphically, find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1.
That is, graph each line and see where they intersect.

3. You have a system of k equations in two variables, k > 2. Explain the geometric 
significance of 

(a) No solution. 

(b) A unique solution. 

(c) An infinite number of solutions. 
1.2 Systems of Equations, Algebraic Procedures


Outcomes 


A. Use elementary operations to find the solution to a linear system of equations. 

B. Find the row-echelon form and reduced row-echelon form of a matrix. 

C. Determine whether a system of linear equations has no solution, a unique solution 
or an infinite number of solutions from its row-echelon form. 

D. Solve a system of equations using Gaussian Elimination and Gauss-Jordan Elimination.

E. Model a physical system with linear equations and then solve. 


We have taken an in-depth look at graphical representations of systems of equations, as
well as how to find possible solutions graphically. Our attention now turns to working with 
systems algebraically. 



The relative size of m and n is not important here. Notice that we have allowed $a_{ij}$ and
$b_j$ to be any real number. We can also call these numbers scalars. We will use this term
throughout the text, so keep in mind that the term scalar just means that we are working
with real numbers.
Now, suppose we have a system where $b_i = 0$ for all $i$. In other words, every equation
equals 0. This is a special type of system.


Definition 1.3: Homogeneous System of Equations

A system of equations is called homogeneous if each equation in the system is equal
to 0. A homogeneous system has the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0
\end{aligned}$$

where $a_{ij}$ are scalars and $x_i$ are variables.


Recall from the previous section that our goal when working with systems of linear 
equations was to find the point of intersection of the equations when graphed. In other 
words, we looked for the solutions to the system. We now wish to find these solutions 
algebraically. We want to find values for $x_1, \cdots, x_n$ which solve all of the equations. If such
a set of values exists, we call $(x_1, \cdots, x_n)$ a solution, and the collection of all solutions
is the solution set.

Recall the above discussions about the types of solutions possible. We will see that 
systems of linear equations will have one unique solution, infinitely many solutions, or no 
solution. Consider the following definition. 


Definition 1.4: Consistent and Inconsistent Systems 


A system of linear equations is called consistent if there exists at least one solution. 
It is called inconsistent if there is no solution. 


If you think of each equation as a condition which must be satisfied by the variables, 
consistent would mean there is some choice of variables which can satisfy all the conditions. 
Inconsistent would mean there is no choice of the variables which can satisfy all of the 
conditions. 

The following sections provide methods for determining if a system is consistent or
inconsistent, and finding solutions if they exist.


1.2.1. Elementary Operations 


We begin this section with an example. Recall from Example 1.1 that the solution to the 
given system was (x,y) = (—1,4). 
Example 1.5: Verifying an Ordered Pair is a Solution


Algebraically verify that (x, y) = (-1, 4) is a solution to the following system of
equations.

x + y = 3
y - x = 5


Solution. By graphing these two equations and identifying the point of intersection, we
previously found that (x, y) = (-1, 4) is the unique solution.

We can verify algebraically by substituting these values into the original equations, and
ensuring that the equations hold. First, we substitute the values into the first equation and
check that it equals 3.

x + y = (-1) + (4) = 3

This equals 3 as needed, so we see that (-1, 4) is a solution to the first equation. Substituting
the values into the second equation yields

y - x = (4) - (-1) = 4 + 1 = 5

which is true. For (x, y) = (-1, 4) each equation is true and therefore, this is a solution to
the system. □
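The substitution check in Example 1.5 is mechanical, so it can be sketched in a few lines of Python. The function name `satisfies` and the representation of each equation as a pair (coefficients, constant) are assumptions chosen for this illustration, not notation used elsewhere in the text.

```python
def satisfies(equations, point):
    """Return True if `point` makes every equation true.

    Hypothetical representation: each equation is (coefficients, constant),
    meaning coefficients[0]*x_1 + ... + coefficients[n-1]*x_n = constant.
    """
    return all(
        sum(a * v for a, v in zip(coeffs, point)) == const
        for coeffs, const in equations
    )


# x + y = 3 and y - x = 5, with the variables ordered as (x, y).
system = [((1, 1), 3), ((-1, 1), 5)]

print(satisfies(system, (-1, 4)))  # True
print(satisfies(system, (0, 3)))   # False: satisfies only the first equation
```

Note that a point must satisfy every equation to be a solution of the system; (0, 3) lies on the first line but not on the second.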

Now, the interesting question is this: If you were not given these numbers to verify, how 
could you algebraically determine the solution? Linear algebra gives us the tools needed to 
answer this question. The following basic operations are important tools that we will utilize. 


Definition 1.6: Elementary Operations 


Elementary operations are those operations consisting of the following. 

1. Interchange the order in which the equations are listed. 

2. Multiply any equation by a nonzero number. 

3. Replace any equation with itself added to a multiple of another equation. 


It is important to note that none of these operations will change the set of solutions of 
the system of equations. In fact, elementary operations are the key tool we use in linear 
algebra to find solutions to systems of equations. 

Consider the following example. 
Example 1.7

Show that the system

x + y = 7
2x - y = 8

has the same solution as the system

x + y = 7
-3y = -6

Solution. Notice that the second system has been obtained by taking the second equation
of the first system and adding -2 times the first equation, as follows:

2x - y + (-2)(x + y) = 8 + (-2)(7)


By simplifying, we obtain

-3y = -6

which is the second equation in the second system. Now, from here we can solve for y and
see that y = 2. Next, we substitute this value into the first equation as follows

x + y = x + 2 = 7

Hence x = 5 and so (x, y) = (5, 2) is a solution to the second system. We want to check if
(5, 2) is also a solution to the first system. We check this by substituting (x, y) = (5, 2) into
the system and ensuring the equations are true.

x + y = (5) + (2) = 7
2x - y = 2(5) - (2) = 8

Hence, (5, 2) is also a solution to the first system. □

This example illustrates how an elementary operation applied to a system of two equations 
in two variables does not affect the solution set. However, a linear system may involve many 
equations and many variables and there is no reason to limit our study to small systems. 
For any size of system in any number of variables, the solution set is still the collection 
of solutions to the equations. In every case, the above operations of Definition 1.6 do not 
change the set of solutions to the system of linear equations. 

In the following theorem, we use the notation $E_i$ to represent the expression on the left
side of an equation, while $b_i$ denotes the constant on the right side.
Theorem 1.8: Elementary Operations and Solutions


Suppose you have a system of two linear equations

$$\begin{aligned} E_1 &= b_1 \\ E_2 &= b_2 \end{aligned} \qquad (1.1)$$

Then the following systems have the same solution set as 1.1:

1.

$$\begin{aligned} E_2 &= b_2 \\ E_1 &= b_1 \end{aligned} \qquad (1.2)$$

2.

$$\begin{aligned} E_1 &= b_1 \\ kE_2 &= kb_2 \end{aligned} \qquad (1.3)$$

for any scalar $k$, provided $k \neq 0$.

3.

$$\begin{aligned} E_1 &= b_1 \\ E_2 + kE_1 &= b_2 + kb_1 \end{aligned} \qquad (1.4)$$

for any scalar $k$ (including $k = 0$).



Before we proceed with the proof of Theorem 1.8, let us consider this theorem in the context
of Example 1.7. Then,

$$\begin{aligned} E_1 &= x + y, & b_1 &= 7 \\ E_2 &= 2x - y, & b_2 &= 8 \end{aligned}$$

Recall the elementary operation that we used to modify the system in the solution to the
example. We added (-2) times the first equation to the second equation. In terms of
Theorem 1.8, this action is given by

$$E_2 + (-2) E_1 = b_2 + (-2) b_1$$

or

$$2x - y + (-2)(x + y) = 8 + (-2)(7)$$

This gave us the second system in Example 1.7, given by

$$\begin{aligned} E_1 &= b_1 \\ E_2 + (-2) E_1 &= b_2 + (-2) b_1 \end{aligned}$$

From this point, we were able to find the solution to the system. Theorem 1.8 tells us
that the solution we found is in fact a solution to the original system.

We will now prove Theorem 1.8. 

Proof. 
1. The proof that the systems 1.1 and 1.2 have the same solution set is as follows. Suppose
that $(x_1, \cdots, x_n)$ is a solution to $E_1 = b_1, E_2 = b_2$. We want to show that this is a
solution to the system in 1.2 above. This is clear, because the system in 1.2 is the
original system, but listed in a different order. Changing the order does not affect the
solution set, so $(x_1, \cdots, x_n)$ is a solution to 1.2.

2. Next we want to prove that the systems 1.1 and 1.3 have the same solution set. That is,
$E_1 = b_1, E_2 = b_2$ has the same solution set as the system $E_1 = b_1, kE_2 = kb_2$ provided
$k \neq 0$. Let $(x_1, \cdots, x_n)$ be a solution of $E_1 = b_1, E_2 = b_2$. We want to show that it
is a solution to $E_1 = b_1, kE_2 = kb_2$. Notice that the only difference between these two
systems is that the second involves multiplying the equation $E_2 = b_2$ by the scalar
$k$. Recall that when you multiply both sides of an equation by the same number, the
sides are still equal to each other. Hence if $(x_1, \cdots, x_n)$ is a solution to $E_2 = b_2$, then
it will also be a solution to $kE_2 = kb_2$. Hence, $(x_1, \cdots, x_n)$ is also a solution to 1.3.

Similarly, let $(x_1, \cdots, x_n)$ be a solution of $E_1 = b_1, kE_2 = kb_2$. Then we can multiply
the equation $kE_2 = kb_2$ by the scalar $1/k$, which is possible only because we have
required that $k \neq 0$. Just as above, this action preserves equality and we obtain the
equation $E_2 = b_2$. Hence $(x_1, \cdots, x_n)$ is also a solution to $E_1 = b_1, E_2 = b_2$.

3. Finally, we will prove that the systems 1.1 and 1.4 have the same solution set. We will
show that any solution of $E_1 = b_1, E_2 = b_2$ is also a solution of 1.4. Then, we will show
that any solution of 1.4 is also a solution of $E_1 = b_1, E_2 = b_2$. Let $(x_1, \cdots, x_n)$ be a
solution to $E_1 = b_1, E_2 = b_2$. Then in particular it solves $E_1 = b_1$. Hence, it solves the
first equation in 1.4. Similarly, it also solves $E_2 = b_2$. By our proof of 1.3, it also solves
$kE_1 = kb_1$. Notice that if we add $E_2$ and $kE_1$, this is equal to $b_2 + kb_1$. Therefore, if
$(x_1, \cdots, x_n)$ solves $E_1 = b_1, E_2 = b_2$ it must also solve $E_2 + kE_1 = b_2 + kb_1$.

Now suppose $(x_1, \cdots, x_n)$ solves the system $E_1 = b_1, E_2 + kE_1 = b_2 + kb_1$. Then
in particular it is a solution of $E_1 = b_1$. Again by our proof of 1.3, it is also a
solution to $kE_1 = kb_1$. Now if we subtract these equal quantities from both sides of
$E_2 + kE_1 = b_2 + kb_1$ we obtain $E_2 = b_2$, which shows that the solution also satisfies
$E_1 = b_1, E_2 = b_2$.


Stated simply, the above theorem shows that the elementary operations do not change 
the solution set of a system of equations. 

We will now look at an example of a system of three equations and three variables. 
Similarly to the previous examples, the goal is to find values for x, y, z such that each of the
given equations is satisfied when these values are substituted in.
Example 1.9: Solving a System of Equations with Elementary Operations


Find the solutions to the system,

$$\begin{aligned} x + 3y + 6z &= 25 \\ 2x + 7y + 14z &= 58 \\ 2y + 5z &= 19 \end{aligned} \qquad (1.5)$$


Solution. We can relate this system to Theorem 1.8 above. In this case, we have

$$\begin{aligned} E_1 &= x + 3y + 6z, & b_1 &= 25 \\ E_2 &= 2x + 7y + 14z, & b_2 &= 58 \\ E_3 &= 2y + 5z, & b_3 &= 19 \end{aligned}$$

Theorem 1.8 claims that if we do elementary operations on this system, we will not change
the solution set. Therefore, we can solve this system using the elementary operations given
in Definition 1.6. First, replace the second equation by (-2) times the first equation added
to the second. This yields the system

$$\begin{aligned} x + 3y + 6z &= 25 \\ y + 2z &= 8 \\ 2y + 5z &= 19 \end{aligned} \qquad (1.6)$$

Now, replace the third equation with (-2) times the second added to the third. This yields
the system

$$\begin{aligned} x + 3y + 6z &= 25 \\ y + 2z &= 8 \\ z &= 3 \end{aligned} \qquad (1.7)$$

At this point, we can easily find the solution. Simply take z = 3 and substitute this back
into the previous equation to solve for y, and similarly to solve for x.

$$\begin{aligned} x + 3y + 6(3) = x + 3y + 18 &= 25 \\ y + 2(3) = y + 6 &= 8 \\ z &= 3 \end{aligned}$$

The second equation is now

y + 6 = 8

You can see from this equation that y = 2. Therefore, we can substitute this value into the
first equation as follows:

x + 3(2) + 18 = 25

By simplifying this equation, we find that x = 1. Hence, the solution to this system is
(x, y, z) = (1, 2, 3). This process is called back substitution.
Alternatively, in 1.7 you could have continued as follows. Add (-2) times the third
equation to the second and then add (-6) times the third to the first. This yields

$$\begin{aligned} x + 3y &= 7 \\ y &= 2 \\ z &= 3 \end{aligned}$$

Now add (-3) times the second to the first. This yields

$$\begin{aligned} x &= 1 \\ y &= 2 \\ z &= 3 \end{aligned}$$

a system which has the same solution set as the original system. This avoided back
substitution and led to the same solution set. It is your decision which you prefer to use, as both
methods lead to the correct solution, (x, y, z) = (1, 2, 3). □
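The back substitution used in Example 1.9 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `back_substitute` is hypothetical, each equation is stored as a list of its coefficients followed by its constant, and the system is assumed to be triangular with nonzero entries on the diagonal, as system 1.7 is.

```python
from fractions import Fraction


def back_substitute(rows):
    """Solve a triangular system by back substitution.

    Assumptions (not checked): `rows` has n rows of n + 1 numbers each,
    coefficients first and the constant last, with zeros below the
    diagonal and nonzero diagonal entries.
    """
    n = len(rows)
    x = [Fraction(0)] * n
    # Work from the last equation up, using the values already found.
    for i in range(n - 1, -1, -1):
        known = sum(Fraction(rows[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(rows[i][n]) - known) / Fraction(rows[i][i])
    return x


# System 1.7: x + 3y + 6z = 25, y + 2z = 8, z = 3
system_1_7 = [[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]]
solution = back_substitute(system_1_7)
print([int(v) for v in solution])  # [1, 2, 3]
```

The loop mirrors the hand computation: z is read off first, then y, then x.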


1.2.2. Gaussian Elimination 


The work we did in the previous section will always find the solution to the system. In this 
section, we will explore a less cumbersome way to find the solutions. First, we will represent 
a linear system with an augmented matrix. A matrix is simply a rectangular array of 
numbers. The size or dimension of a matrix is defined as $m \times n$ where m is the number
of rows and n is the number of columns. In order to construct an augmented matrix from 
a linear system, we create a coefficient matrix from the coefficients of the variables in 
the system, as well as a constant matrix from the constants. The coefficients from one 
equation of the system create one row of the augmented matrix. 

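For example, consider the linear system in Example 1.9

$$\begin{aligned} x + 3y + 6z &= 25 \\ 2x + 7y + 14z &= 58 \\ 2y + 5z &= 19 \end{aligned}$$

This system can be written as an augmented matrix, as follows

$$\left[\begin{array}{rrr|r} 1 & 3 & 6 & 25 \\ 2 & 7 & 14 & 58 \\ 0 & 2 & 5 & 19 \end{array}\right]$$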

Notice that it has exactly the same information as the original system. Here it is understood
that the first column contains the coefficients from x in each equation, in order,
$\left[\begin{array}{r} 1 \\ 2 \\ 0 \end{array}\right]$. Similarly, we create a column from the coefficients on y in each equation,
$\left[\begin{array}{r} 3 \\ 7 \\ 2 \end{array}\right]$, and a column from the coefficients on z in each equation,
$\left[\begin{array}{r} 6 \\ 14 \\ 5 \end{array}\right]$. For a system of more
than three variables, we would continue in this way constructing a column for each variable.
Similarly, for a system of fewer than three variables, we simply construct a column for each
variable.

Finally, we construct a column from the constants of the equations,
$\left[\begin{array}{r} 25 \\ 58 \\ 19 \end{array}\right]$.

The rows of the augmented matrix correspond to the equations in the system. For example,
the top row in the augmented matrix, [ 1 3 6 | 25 ], corresponds to the equation


x + 3y + 6z = 25. 
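As a small illustration of this construction, the following Python sketch joins a coefficient matrix and a list of constants into an augmented matrix stored as a list of rows. The name `augmented_matrix` and the list-of-rows representation are assumptions for this example only.

```python
def augmented_matrix(coefficients, constants):
    """Append each equation's constant to its row of coefficients.

    Hypothetical helper: `coefficients` is a list of rows, one per
    equation, and `constants` lists the right-hand sides in the same order.
    """
    if len(coefficients) != len(constants):
        raise ValueError("need exactly one constant per equation")
    return [list(row) + [b] for row, b in zip(coefficients, constants)]


# The system of Example 1.9; the missing x term in the third
# equation is recorded as a coefficient of 0.
coefficients = [[1, 3, 6], [2, 7, 14], [0, 2, 5]]
constants = [25, 58, 19]

for row in augmented_matrix(coefficients, constants):
    print(row)
# [1, 3, 6, 25]
# [2, 7, 14, 58]
# [0, 2, 5, 19]
```

A variable that does not appear in an equation still needs an entry in its column, which is why the third row begins with 0.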


Consider the following definition. 


Definition 1.10: Augmented Matrix of a Linear System 


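For a linear system of the form

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\ \,\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

where the $x_i$ are variables and the $a_{ij}$ and $b_i$ are constants, the augmented matrix of
this system is given by

$$\left[\begin{array}{ccc|c} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{array}\right]$$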


Now, consider elementary operations in the context of the augmented matrix. The elementary
operations in Definition 1.6 can be used on the rows just as we used them on
equations previously. Changes to a system of equations as a result of an elementary operation
are equivalent to changes in the augmented matrix resulting from the corresponding
row operation. Note that Theorem 1.8 implies that any elementary row operations used on
an augmented matrix will not change the solution to the corresponding system of equations.
We now formally define elementary row operations. These are the key tool we will use to 
find solutions to systems of equations. 
Definition 1.11: Elementary Row Operations


The elementary row operations (also known as row operations) consist of the
following

1. Switch two rows.

2. Multiply a row by a nonzero number.

3. Replace a row by any multiple of another row added to it.
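The three row operations of Definition 1.11 can be sketched as small Python functions acting on a matrix stored as a list of rows. The function names are hypothetical, rows are numbered from 0 as is usual in Python, and each function returns a new matrix rather than changing its input.

```python
def swap_rows(matrix, i, j):
    """Operation 1: switch rows i and j."""
    result = [row[:] for row in matrix]
    result[i], result[j] = result[j], result[i]
    return result


def scale_row(matrix, i, k):
    """Operation 2: multiply row i by the nonzero number k."""
    if k == 0:
        raise ValueError("the multiplier must be nonzero")
    result = [row[:] for row in matrix]
    result[i] = [k * entry for entry in result[i]]
    return result


def add_multiple(matrix, target, source, k):
    """Operation 3: replace row `target` by itself plus k times row `source`."""
    if target == source:
        raise ValueError("the multiple must come from another row")
    result = [row[:] for row in matrix]
    result[target] = [t + k * s for t, s in zip(matrix[target], matrix[source])]
    return result


# Augmented matrix of x + 3y + 6z = 25, 2x + 7y + 14z = 58, 2y + 5z = 19.
aug = [[1, 3, 6, 25], [2, 7, 14, 58], [0, 2, 5, 19]]
step1 = add_multiple(aug, 1, 0, -2)    # (-2) times row 1 added to row 2
step2 = add_multiple(step1, 2, 1, -2)  # (-2) times row 2 added to row 3
print(step2)  # [[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]]
```

The two calls reproduce, row by row, the elementary operations used on the equations of system 1.5.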


Recall how we solved Example 1.9. We can do the exact same steps as above, except now
in the context of an augmented matrix and using row operations. The augmented matrix of
this system is

$$\left[\begin{array}{rrr|r} 1 & 3 & 6 & 25 \\ 2 & 7 & 14 & 58 \\ 0 & 2 & 5 & 19 \end{array}\right]$$

Thus the first step in solving the system given by 1.5 would be to take (-2) times the first
row of the augmented matrix and add it to the second row,

$$\left[\begin{array}{rrr|r} 1 & 3 & 6 & 25 \\ 0 & 1 & 2 & 8 \\ 0 & 2 & 5 & 19 \end{array}\right]$$

Note how this corresponds to 1.6. Next take (-2) times the second row and add to the third,

$$\left[\begin{array}{rrr|r} 1 & 3 & 6 & 25 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & 1 & 3 \end{array}\right]$$

This augmented matrix corresponds to the system

$$\begin{aligned} x + 3y + 6z &= 25 \\ y + 2z &= 8 \\ z &= 3 \end{aligned}$$

which is the same as 1.7. By back substitution you obtain the solution x = 1, y = 2, and
z = 3.

Through a systematic procedure of row operations, we can simplify an augmented matrix 
and carry it to row-echelon form or reduced row-echelon form, which we define next. 
These forms are used to find the solutions of the system of equations corresponding to the 
augmented matrix. 

In the following definitions, the term leading entry refers to the first nonzero entry of 
a row when scanning the row from left to right. 
Definition 1.12: Row-Echelon Form


An augmented matrix is in row-echelon form if 

1. All nonzero rows are above any rows of zeros. 

2. Each leading entry of a row is in a column to the right of the leading entries of 
any row above it. 

3. Each leading entry of a row is equal to 1. 
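The three conditions of Definition 1.12 can be tested one row at a time. The following Python sketch does so for a matrix stored as a list of rows; the names `leading_index` and `is_row_echelon` are hypothetical and chosen only for this illustration.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for a row of zeros."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None


def is_row_echelon(matrix):
    """Check the three conditions of the row-echelon form definition."""
    previous_lead = -1
    seen_zero_row = False
    for row in matrix:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False  # condition 1 fails: nonzero row below a zero row
        if lead <= previous_lead:
            return False  # condition 2 fails: leading entry not to the right
        if row[lead] != 1:
            return False  # condition 3 fails: leading entry is not 1
        previous_lead = lead
    return True


print(is_row_echelon([[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]]))     # True
print(is_row_echelon([[1, 3, 6, 25], [2, 7, 14, 58], [0, 2, 5, 19]]))  # False
```

Because the leading entries are compared only with the row just above, and must move strictly to the right each time, each one is automatically to the right of every leading entry above it.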


We also consider another reduced form of the augmented matrix which has one further 
condition. 


Definition 1.13: Reduced Row-Echelon Form


An augmented matrix is in reduced row-echelon form if

1. All nonzero rows are above any rows of zeros.

2. Each leading entry of a row is in a column to the right of the leading entries of
any row above it.

3. Each leading entry of a row is equal to 1.

4. All entries in a column above and below a leading entry are zero.



Notice that the first three conditions on a reduced row-echelon form matrix are the same 
as those for row-echelon form. 

Hence, every reduced row-echelon form matrix is also in row-echelon form. The converse 
is not necessarily true; we cannot assume that every matrix in row-echelon form is also in 
reduced row-echelon form. However, it often happens that the row-echelon form is sufficient 
to provide information about the solution of a system. 

The following examples describe matrices in these various forms. As an exercise, take 
the time to carefully verify that they are in the specified form. 





Example 1.15: Matrices in Row-Echelon Form 


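The following augmented matrices are in row-echelon form, but not in reduced row-echelon
form.

$$\left[\begin{array}{rrrrr} 1 & 0 & 6 & 5 & 8 \\ 0 & 0 & 1 & 2 & 7 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right], \quad
\left[\begin{array}{rrr} 1 & 3 & 5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right], \quad
\left[\begin{array}{rrr} 1 & 0 & 6 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\right]$$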

Notice that we could apply further row operations to these matrices to carry them to 
reduced row-echelon form. Take the time to try that on your own. Consider the following 
matrices, which are in reduced row-echelon form. 


Example 1.16: Matrices in Reduced Row-Echelon Form 


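The following augmented matrices are in reduced row-echelon form.

$$\left[\begin{array}{rrrrr} 1 & 0 & 0 & 5 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right], \quad
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right], \quad
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array}\right]$$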









One way in which the row-echelon form of a matrix is useful is in identifying the pivot 
positions and pivot columns of the matrix. 


Definition 1.17: Pivot Position and Pivot Column 


A pivot position in a matrix is the location of a leading entry in the row-echelon
form of a matrix. A pivot column is a column that contains a pivot position.


For example consider the following. 


Example 1.18: Pivot Position 


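Let

$$A = \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 6 \\ 4 & 4 & 4 & 10 \end{array}\right]$$

Where are the pivot positions and pivot columns of the augmented matrix A?


Solution. The row-echelon form of this matrix is

$$\left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & \frac{3}{2} \\ 0 & 0 & 0 & 0 \end{array}\right]$$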
This is all we need in this example, but note that this matrix is not in reduced row-echelon
form.

In order to identify the pivot positions in the original matrix, we look for the leading
entries in the row-echelon form of the matrix. Here, the entry in the first row and first
column, as well as the entry in the second row and second column are the leading entries.
Hence, these locations are the pivot positions. We identify the pivot positions in the original
matrix, as in the following:

$$\left[\begin{array}{rrr|r} \boxed{1} & 2 & 3 & 4 \\ 3 & \boxed{2} & 1 & 6 \\ 4 & 4 & 4 & 10 \end{array}\right]$$


Thus the pivot columns in the matrix are the first two columns. □ 
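Once a row-echelon form is known, reading off the pivot positions is a matter of locating the first nonzero entry of each row. The following Python sketch does this for the row-echelon form found in Example 1.18; the function name `pivot_positions` is hypothetical, and positions are reported with rows and columns numbered from 0, so (0, 0) is the first row and first column.

```python
from fractions import Fraction


def pivot_positions(row_echelon):
    """Return (row, column) pairs of the leading entries, numbered from 0.

    Assumes the input is already in row-echelon form.
    """
    positions = []
    for i, row in enumerate(row_echelon):
        for j, entry in enumerate(row):
            if entry != 0:
                positions.append((i, j))
                break  # only the first nonzero entry of a row is a leading entry
    return positions


# Row-echelon form of A from Example 1.18; the entry 3/2 is kept exact.
ref = [[1, 2, 3, 4], [0, 1, 2, Fraction(3, 2)], [0, 0, 0, 0]]

positions = pivot_positions(ref)
print(positions)                   # [(0, 0), (1, 1)]
print([j for _, j in positions])   # [0, 1], that is, the first two columns
```

The row of zeros contributes no pivot position, which is why only two of the columns are pivot columns.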

The following is an algorithm for carrying a matrix to row-echelon form and reduced
row-echelon form. You may wish to use this algorithm to carry the above matrix to row-echelon
form or reduced row-echelon form yourself for practice.


Algorithm 1.19: Reduced Row-Echelon Form Algorithm 


This algorithm provides a method for using row operations to take a matrix to its 
reduced row-echelon form. We begin with the matrix in its original form. 

1. Starting from the left, find the first nonzero column. This is the first pivot 
column, and the position at the top of this column is the first pivot position.
Switch rows if necessary to place a nonzero number in the first pivot position. 

2. Use row operations to make the entries below the first pivot position (in the first 
pivot column) equal to zero. 

3. Ignoring the row containing the first pivot position, repeat steps 1 and 2 with 
the remaining rows. Repeat the process until there are no more rows to modify. 

4. Divide each nonzero row by the value of the leading entry, so that the leading 
entry becomes 1. The matrix will then be in row-echelon form. 

The following step will carry the matrix from row-echelon form to reduced row-echelon 
form. 

5. Moving from right to left, use row operations to create zeros in the entries of the 
pivot columns which are above the pivot positions. The result will be a matrix 
in reduced row-echelon form. 
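The five steps above can be sketched in code. The following is a minimal illustration, not a definitive implementation: the function name `rref` and the list-of-rows representation are assumptions made for this sketch, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def rref(rows):
    """Carry a matrix (given as a list of rows) to reduced row-echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    pivots = []            # (row, column) of each pivot position found
    r = 0                  # next row that still needs a pivot
    # Steps 1-3: work through the columns from left to right.
    for c in range(n_cols):
        if r == n_rows:
            break
        # Step 1: look for a nonzero entry in column c, at or below row r.
        p = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if p is None:
            continue       # column c is not a pivot column
        m[r], m[p] = m[p], m[r]
        # Step 2: create zeros below the pivot position.
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append((r, c))
        r += 1
    # Step 4: divide each nonzero row by its leading entry.
    for pr, pc in pivots:
        lead = m[pr][pc]
        m[pr] = [a / lead for a in m[pr]]
    # Step 5: moving from right to left, create zeros above each pivot.
    for pr, pc in reversed(pivots):
        for i in range(pr):
            factor = m[i][pc]
            m[i] = [a - factor * b for a, b in zip(m[i], m[pr])]
    return m

print(rref([[0, -5, -4], [1, 4, 3], [5, 10, 7]]))
```

Stopping after step 4 would give a row-echelon form; step 5 is what makes the result reduced.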


Most often we will apply this algorithm to an augmented matrix in order to find the 
solution to a system of linear equations. However, we can use this algorithm to compute the 
reduced row-echelon form of any matrix, which could be useful in other applications.
Consider the following example of Algorithm 1.19.

Example 1.20: Finding Row-Echelon Form and Reduced Row-Echelon Form of a Matrix


Let

\[ A = \left[\begin{array}{rrr} 0 & -5 & -4 \\ 1 & 4 & 3 \\ 5 & 10 & 7 \end{array}\right] \]

Find the row-echelon form of A. Then complete the process until A is in reduced 
row-echelon form. 


Solution. In working through this example, we will use the steps outlined in Algorithm 1.19. 

1. The first pivot column is the first column of the matrix, as this is the first nonzero 
column from the left. Hence the first pivot position is the one in the first row and first 
column. Switch the first two rows to obtain a nonzero entry in the first pivot position, 
outlined in a box below. 

\[ \left[\begin{array}{rrr} \boxed{1} & 4 & 3 \\ 0 & -5 & -4 \\ 5 & 10 & 7 \end{array}\right] \]

2. Step two involves creating zeros in the entries below the first pivot position. The first
entry of the second row is already a zero. All we need to do is subtract 5 times the
first row from the third row. The resulting matrix is

\[ \left[\begin{array}{rrr} 1 & 4 & 3 \\ 0 & -5 & -4 \\ 0 & -10 & -8 \end{array}\right] \]

3. Now ignore the top row. Apply steps 1 and 2 to the smaller matrix

\[ \left[\begin{array}{rr} -5 & -4 \\ -10 & -8 \end{array}\right] \]

In this matrix, the first column is a pivot column, and -5 is in the first pivot position.
Therefore, we need to create a zero below it. To do this, subtract 2 times the first row (of
this matrix) from the second. The resulting matrix is

\[ \left[\begin{array}{rr} -5 & -4 \\ 0 & 0 \end{array}\right] \]

Our original matrix now looks like

\[ \left[\begin{array}{rrr} 1 & 4 & 3 \\ 0 & -5 & -4 \\ 0 & 0 & 0 \end{array}\right] \]

We can see that there are no more rows to modify.

4. Now, we need to create leading 1s in each row. The first row already has a leading 1
so no work is needed here. Divide the second row by -5 to create a leading 1. The
resulting matrix is

\[ \left[\begin{array}{rrr} 1 & 4 & 3 \\ 0 & 1 & \frac{4}{5} \\ 0 & 0 & 0 \end{array}\right] \]

This matrix is now in row-echelon form. 

5. Now create zeros in the entries above pivot positions in each column, in order to carry 
this matrix all the way to reduced row-echelon form. Notice that there is no pivot 
position in the third column so we do not need to create any zeros in this column! The 
column in which we need to create zeros is the second. To do so, subtract 4 times the 
second row from the first row. The resulting matrix is 

\[ \left[\begin{array}{rrr} 1 & 0 & -\frac{1}{5} \\ 0 & 1 & \frac{4}{5} \\ 0 & 0 & 0 \end{array}\right] \]

This matrix is now in reduced row-echelon form. □ 
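The example can also be replayed one row operation at a time. The sketch below is only an illustration: the helper names `swap`, `scale`, and `add_multiple` are hypothetical, rows are indexed from 0, and each multiplier is computed from the current entries of the matrix rather than typed in by hand.

```python
from fractions import Fraction

def swap(m, i, j):
    """Interchange rows i and j."""
    m[i], m[j] = m[j], m[i]

def scale(m, i, k):
    """Multiply row i by the nonzero number k."""
    m[i] = [k * a for a in m[i]]

def add_multiple(m, i, j, k):
    """Add k times row j to row i."""
    m[i] = [a + k * b for a, b in zip(m[i], m[j])]

# The matrix A of the example, stored with exact fractions.
A = [[Fraction(x) for x in row] for row in [[0, -5, -4], [1, 4, 3], [5, 10, 7]]]

swap(A, 0, 1)                                  # step 1: nonzero entry in the first pivot position
add_multiple(A, 2, 0, -A[2][0] / A[0][0])      # step 2: zero below the first pivot
add_multiple(A, 2, 1, -A[2][1] / A[1][1])      # step 3: zero below the second pivot
scale(A, 1, 1 / A[1][1])                       # step 4: leading 1 in the second row
ref = [row[:] for row in A]                    # copy of the row-echelon form
add_multiple(A, 0, 1, -A[0][1])                # step 5: zero above the second pivot

print(ref)
print(A)
```

Only the three elementary row operations are used, so every intermediate matrix is equivalent to the original one.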

The above algorithm gives you a simple way to obtain the row-echelon form and reduced 
row-echelon form of a matrix. The main idea is to do row operations in such a way as to end 
up with a matrix in row-echelon form or reduced row-echelon form. This process is important 
because the resulting matrix will allow you to describe the solutions to the corresponding 
linear system of equations in a meaningful way. 

In the next example, we look at how to solve a system of equations using the corresponding 
augmented matrix. 



Solution. The augmented matrix for this system is 


\[ \left[\begin{array}{rrr|r} 2 & 4 & -3 & -1 \\ 5 & 10 & -7 & -2 \\ 3 & 6 & 5 & 9 \end{array}\right] \]


In order to find the solution to this system, we wish to carry the augmented matrix to
reduced row-echelon form. We will do so using Algorithm 1.19. Notice that the first column
is nonzero, so this is our first pivot column. The first entry in the first row, 2, is the first
leading entry and it is in the first pivot position. We will use row operations to create zeros
in the entries below the 2. First, replace the second row with -5 times the first row plus 2
times the second row. This yields

\[ \left[\begin{array}{rrr|r} 2 & 4 & -3 & -1 \\ 0 & 0 & 1 & 1 \\ 3 & 6 & 5 & 9 \end{array}\right] \]


Now, replace the third row with -3 times the first row plus 2 times the third row. This
yields

\[ \left[\begin{array}{rrr|r} 2 & 4 & -3 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 19 & 21 \end{array}\right] \]

Now the entries in the first column below the pivot position are zeros. We now look for the
second pivot column, which in this case is column three. Here, the 1 in the second row and
third column is in the pivot position. We need to do just one row operation to create a zero
below the 1.

Taking -19 times the second row and adding it to the third row yields

\[ \left[\begin{array}{rrr|r} 2 & 4 & -3 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 2 \end{array}\right] \]

We could proceed with the algorithm to carry this matrix to row-echelon form or reduced
row-echelon form. However, remember that we are looking for the solutions to the system
of equations. Take another look at the third row of the matrix. Notice that it corresponds
to the equation

\[ 0x + 0y + 0z = 2 \]

There is no solution to this equation because for all x, y, z, the left side will equal 0 and
$0 \neq 2$. This shows there is no solution to the given system of equations. In other words,
this system is inconsistent. □
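The test used in this example, looking for a row whose coefficients are all zero while its constant is not, can be automated. The following is a minimal sketch under stated assumptions: the names `echelon` and `is_inconsistent` are hypothetical, the matrix is the augmented matrix from the solution above, and the elimination uses exact fractions instead of the integer row combinations chosen in the text.

```python
from fractions import Fraction

def echelon(rows):
    """Forward elimination only; leading entries are not scaled to 1."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            k = m[i][c] / m[r][c]
            m[i] = [a - k * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def is_inconsistent(augmented):
    """True if some row reduces to the equation 0 = c with c nonzero."""
    for row in echelon(augmented):
        if all(a == 0 for a in row[:-1]) and row[-1] != 0:
            return True
    return False

system = [[2, 4, -3, -1], [5, 10, -7, -2], [3, 6, 5, 9]]
print(is_inconsistent(system))
```

The particular nonzero constant that appears in the last row depends on the row operations chosen, but whether such a row appears does not.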

The following is another example of how to find the solution to a system of equations by 
carrying the corresponding augmented matrix to reduced row-echelon form. 



Solution. The augmented matrix of this system is 


\[ \left[\begin{array}{rrr|r} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ -2 & 1 & 0 & -6 \end{array}\right] \]

In order to find the solution to this system, we will carry the augmented matrix to reduced
row-echelon form, using Algorithm 1.19. The first column is the first pivot column. We want
to use row operations to create zeros beneath the first entry in this column, which is in the
first pivot position. Replace the third row with 2 times the first row added to 3 times the
third row. This gives

\[ \left[\begin{array}{rrr|r} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ 0 & 1 & -10 & 0 \end{array}\right] \]


Now, we have created zeros beneath the 3 in the first column, so we move on to the second
pivot column (which is the second column) and repeat the procedure. Take -1 times the
second row and add to the third row.

\[ \left[\begin{array}{rrr|r} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]


The entry below the pivot position in the second column is now a zero. Notice that we have 
no more pivot columns because we have only two leading entries. 

At this stage, we also want the leading entries to be equal to one. To do so, divide the 
first row by 3.

\[ \left[\begin{array}{rrr|r} 1 & -\frac{1}{3} & -\frac{5}{3} & 3 \\ 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]


This matrix is now in row-echelon form. 

Let’s continue with row operations until the matrix is in reduced row-echelon form. This 
involves creating zeros above the pivot positions in each pivot column. This requires only 
one step, which is to add $\frac{1}{3}$ times the second row to the first row.

\[ \left[\begin{array}{rrr|r} 1 & 0 & -5 & 3 \\ 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]


This is in reduced row-echelon form, which you should verify using Definition 1.13. The 
equations corresponding to this reduced row-echelon form are 

\[ \begin{array}{rcl} x - 5z & = & 3 \\ y - 10z & = & 0 \end{array} \]

or

\[ \begin{array}{rcl} x & = & 3 + 5z \\ y & = & 10z \end{array} \]

Observe that z is not restrained by any equation. In fact, z can equal any number. For
example, we can let z = t, where we can choose t to be any number. In this context t is
called a parameter. Therefore, the solution set of this system is

\[ \begin{array}{rcl} x & = & 3 + 5t \\ y & = & 10t \\ z & = & t \end{array} \]

where t is arbitrary. The system has an infinite set of solutions which are given by these
equations. For any value of t we select, x, y, and z will be given by the above equations. For
example, if we choose t = 4 then the corresponding solution would be

\[ \begin{array}{rcl} x & = & 3 + 5(4) = 23 \\ y & = & 10(4) = 40 \\ z & = & 4 \end{array} \]

□ 
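A parametric solution is easy to check by substituting it back into every equation. The short sketch below does this for several values of t; the names `augmented`, `solution`, and `satisfies` are assumptions of this illustration, and the rows are taken from the augmented matrix given in the solution above.

```python
# Each row [a, b, c, d] stands for the equation ax + by + cz = d.
augmented = [[3, -1, -5, 9], [0, 1, -10, 0], [-2, 1, 0, -6]]

def solution(t):
    """The solution (x, y, z) corresponding to the parameter value t."""
    return (3 + 5 * t, 10 * t, t)

def satisfies(aug, point):
    """Check every equation of the system at the given point."""
    return all(
        sum(a * v for a, v in zip(row[:-1], point)) == row[-1]
        for row in aug
    )

for t in (-2, 0, 4):
    print(t, solution(t), satisfies(augmented, solution(t)))
```

A finite number of checks is not a proof, but it catches sign and arithmetic slips quickly.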


In Example 1.22 the solution involved one parameter. It may happen that the solution 
to a system involves more than one parameter, as shown in the following example. 



Solution. The augmented matrix is 


\[ \left[\begin{array}{rrrr|r} 1 & 2 & -1 & 1 & 3 \\ 1 & 1 & -1 & 1 & 1 \\ 1 & 3 & -1 & 1 & 5 \end{array}\right] \]


We wish to carry this matrix to row-echelon form. Here, we will outline the row operations 
used. However, make sure that you understand the steps in terms of Algorithm 1.19. 

Take -1 times the first row and add to the second. Then take -1 times the first row
and add to the third. This yields

\[ \left[\begin{array}{rrrr|r} 1 & 2 & -1 & 1 & 3 \\ 0 & -1 & 0 & 0 & -2 \\ 0 & 1 & 0 & 0 & 2 \end{array}\right] \]


Now add the second row to the third row and divide the second row by -1.

\[ \left[\begin{array}{rrrr|r} 1 & 2 & -1 & 1 & 3 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right] \tag{1.9} \]


This matrix is in row-echelon form and we can see that x and y correspond to pivot
columns, while z and w do not. Therefore, we will assign parameters to the variables z and
w. Assign the parameter s to z and the parameter t to w. Then the first row yields the
equation x + 2y - s + t = 3, while the second row yields the equation y = 2. Since y = 2,
the first equation becomes x + 4 - s + t = 3, showing that the solution is given by

\[ \begin{array}{rcl} x & = & -1 + s - t \\ y & = & 2 \\ z & = & s \\ w & = & t \end{array} \]

It is customary to write this solution in the form

\[ \left[\begin{array}{c} x \\ y \\ z \\ w \end{array}\right] = \left[\begin{array}{c} -1 + s - t \\ 2 \\ s \\ t \end{array}\right] \tag{1.10} \]

□


This example shows a system of equations with an infinite solution set which depends on 
two parameters. It can be less confusing in the case of an infinite solution set to first place 
the augmented matrix in reduced row-echelon form rather than just row-echelon form before 
seeking to write down the description of the solution. 

In the above steps, this means we don’t stop with the row-echelon form in equation 1.9.
Instead we first place it in reduced row-echelon form as follows.

\[ \left[\begin{array}{rrrr|r} 1 & 0 & -1 & 1 & -1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right] \]


Then the solution is y = 2 from the second row and x = -1 + z - w from the first. Thus
letting z = s and w = t, the solution is given by 1.10.

You can see here that there are two paths to the correct answer, which both yield the
same answer. Hence, either approach may be used. The process which we first used in the
above solution is called Gaussian Elimination. This process involves carrying the matrix
to row-echelon form, converting back to equations, and using back substitution to find the
solution. When you do row operations until you obtain reduced row-echelon form, the process
is called Gauss-Jordan Elimination.

We have now worked through systems of equations with no solution and with infinitely
many solutions, with one parameter as well as two parameters. Recall the three types of
solution sets which we discussed in the previous section: no solution, one solution, and
infinitely many solutions. Each of these types of solutions could be identified from the graph
of the system. It turns out that we can also identify the type of solution from the reduced 
row-echelon form of the augmented matrix. 

• No Solution: In the case where the system of equations has no solution, the row- 
echelon form of the augmented matrix will have a row of the form 

\[ \left[\begin{array}{rrr|r} 0 & 0 & 0 & 1 \end{array}\right] \]

This row indicates that the system is inconsistent and has no solution. 

• One Solution: In the case where the system of equations has one solution, every 
column of the coefficient matrix is a pivot column. The following is an example of 
an augmented matrix in reduced row-echelon form for a system of equations with one 
solution. 


\[ \left[\begin{array}{rrr|r} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 \end{array}\right] \]

• Infinitely Many Solutions: In the case where the system of equations has infinitely
many solutions, the solution contains parameters. There will be columns of the coefficient
matrix which are not pivot columns. The following are examples of augmented
matrices in reduced row-echelon form for systems of equations with infinitely many
solutions.

\[ \left[\begin{array}{rrr|r} 1 & 0 & 0 & 5 \\ 0 & 1 & 2 & -3 \\ 0 & 0 & 0 & 0 \end{array}\right] \qquad \left[\begin{array}{rrr|r} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & -3 \end{array}\right] \]
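The three cases above can be read off mechanically from an augmented matrix that is already in reduced row-echelon form. The following is a small sketch; the function name `classify` and the returned strings are assumptions of this illustration.

```python
def classify(rref_aug):
    """Classify a system from its augmented matrix in reduced row-echelon form."""
    n = len(rref_aug[0]) - 1              # number of variables
    pivots = 0
    for row in rref_aug:
        # Column index of the leading entry, or None for a zero row.
        lead = next((j for j, a in enumerate(row) if a != 0), None)
        if lead is None:
            continue
        if lead == n:
            return "no solution"          # row of the form [0 ... 0 | 1]
        pivots += 1
    if pivots == n:
        return "one solution"             # every coefficient column is a pivot column
    return "infinitely many solutions"    # some column is not a pivot column

print(classify([[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 2]]))
print(classify([[1, 0, 0, 5], [0, 1, 2, -3], [0, 0, 0, 0]]))
print(classify([[1, 0, 0, 5], [0, 1, 0, -3]]))
```

The function assumes its input really is in reduced row-echelon form; it does not perform any row operations itself.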


1.2.3. Uniqueness of the Reduced Row-Echelon Form 

As we have seen in earlier sections, we know that every matrix can be brought into reduced 
row-echelon form by a sequence of elementary row operations. Here we will prove that 
the resulting matrix is unique; in other words, the resulting matrix in reduced row-echelon 
form does not depend upon the particular sequence of elementary row operations or the order 
in which they were performed. 

Let A be the augmented matrix of a homogeneous system of linear equations in the
variables $x_1, x_2, \ldots, x_n$ which is also in reduced row-echelon form. The matrix A divides the
set of variables into two different types. We say that $x_i$ is a basic variable whenever A has a
leading 1 in column number i, in other words, when column i is a pivot column. Otherwise
we say that $x_i$ is a free variable.

Recall Example 1.23. 



Solution. Recall from the solution of Example 1.23 that the row-echelon form of the
augmented matrix of this system is given by

\[ \left[\begin{array}{rrrr|r} 1 & 2 & -1 & 1 & 3 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right] \]


You can see that columns 1 and 2 are pivot columns. These columns correspond to variables 
x and y, making these the basic variables. Columns 3 and 4 are not pivot columns, which 
means that z and w are free variables.

We can write the solution to this system as

\[ \begin{array}{rcl} x & = & -1 + s - t \\ y & = & 2 \\ z & = & s \\ w & = & t \end{array} \]

Here the free variables are written as parameters, and the basic variables are given by 
linear functions of these parameters. □ 

In general, all solutions can be written in terms of the free variables. In such a description, 
the free variables can take any values (they become parameters), while the basic variables 
become simple linear functions of these parameters. Indeed, a basic variable $x_i$ is a linear
function of only those free variables $x_j$ with j > i. This leads to the following observation.


Proposition 1.25: Basic and Free Variables


If $x_i$ is a basic variable of a homogeneous system of linear equations, then any solution
of the system with $x_j = 0$ for all those free variables $x_j$ with j > i must also have
$x_i = 0$.


Using this proposition, we prove a lemma which will be used in the proof of the main 
result of this section below. 


Lemma 1.26: Solutions and the Reduced Row-Echelon Form of a Matrix 


Let A and B be two distinct augmented matrices for two homogeneous systems of m 
equations in n variables, such that A and B are each in reduced row-echelon form. 
Then, the two systems do not have exactly the same solutions. 


Proof. With respect to the linear systems associated with the matrices A and B, there are 
two cases to consider: 

• Case 1: the two systems have the same basic variables 

• Case 2: the two systems do not have the same basic variables 

In case 1, the two matrices will have exactly the same pivot positions. However, since A and
B are not identical, there is some row of A which is different from the corresponding row of
B and yet the rows each have a pivot in the same column position. Let i be the index of
this column position. Since the matrices are in reduced row-echelon form, the two rows must
differ at some entry in a column j > i. Let these entries be a in A and b in B, where $a \neq b$.
Since A is in reduced row-echelon form, if $x_j$ were a basic variable for its linear system, we
would have a = 0. Similarly, if $x_j$ were a basic variable for the linear system of the matrix B,
we would have b = 0. Since a and b are unequal, they cannot both be equal to 0, and hence
$x_j$ cannot be a basic variable for both linear systems. However, since the systems have the
same basic variables, $x_j$ must then be a free variable for each system. We now look at the
solutions of the systems in which $x_j$ is set equal to 1 and all other free variables are set equal
to 0. For this choice of parameters, the solution of the system for matrix A has $x_i = -a$,
while the solution of the system for matrix B has $x_i = -b$, so that the two systems have
different solutions.

In case 2, there is a variable $x_i$ which is a basic variable for one matrix, let’s say A, and
a free variable for the other matrix B. The system for matrix B has a solution in which
$x_i = 1$ and $x_j = 0$ for all other free variables $x_j$. However, by Proposition 1.25 this cannot
be a solution of the system for the matrix A. This completes the proof of case 2. □

Now, we say that the matrix B is equivalent to the matrix A provided that B can be 
obtained from A by performing a sequence of elementary row operations beginning with A. 
The importance of this concept lies in the following result. 


Theorem 1.27: Equivalent Matrices 


The two linear systems of equations corresponding to two equivalent augmented
matrices have exactly the same solutions.


The proof of this theorem is left as an exercise. 

Now, we can use Lemma 1.26 and Theorem 1.27 to prove the main result of this section. 


Theorem 1.28: Uniqueness of the Reduced Row-Echelon Form 


Every matrix A is equivalent to a unique matrix in reduced row-echelon form. 


Proof. Let A be an m x n matrix and let B and C be matrices in reduced row-echelon form,
each equivalent to A. It suffices to show that B = C.

Let $A^+$ be the matrix A augmented with a new rightmost column consisting entirely of
zeros. Similarly, augment matrices B and C each with a rightmost column of zeros to obtain
$B^+$ and $C^+$. Note that $B^+$ and $C^+$ are matrices in reduced row-echelon form which are
obtained from $A^+$ by respectively applying the same sequence of elementary row operations
which were used to obtain B and C from A.

Now, $A^+$, $B^+$, and $C^+$ can all be considered as augmented matrices of homogeneous
linear systems in the variables $x_1, x_2, \ldots, x_n$. Because $B^+$ and $C^+$ are each equivalent to
$A^+$, Theorem 1.27 ensures that all three homogeneous linear systems have exactly the same
solutions. By Lemma 1.26 we conclude that $B^+ = C^+$. By construction, we must also have
B = C. □

According to this theorem we can say that each matrix A has a unique reduced row- 
echelon form. 


1.2.4. Rank and Homogeneous Systems 


There is a special type of system which requires additional study. This type of system is 
called a homogeneous system of equations, which we defined above in Definition 1.3. Our
focus in this section is to consider what types of solutions are possible for a homogeneous
system of equations.

Consider the following definition. 


Definition 1.29: Trivial Solution

Consider the homogeneous system of equations given by

\[ \begin{array}{c} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0 \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0 \end{array} \]

Then, $x_1 = 0, x_2 = 0, \ldots, x_n = 0$ is always a solution to this system. We call this the
trivial solution.


If the system has a solution in which not all of the $x_1, \ldots, x_n$ are equal to zero, then we
call this solution nontrivial. The trivial solution does not tell us much about the system,
as it says that 0 = 0! Therefore, when working with homogeneous systems of equations, we
want to know when the system has a nontrivial solution.

Suppose we have a homogeneous system of m equations, using n variables, and suppose
that n > m. In other words, there are more variables than equations. Then, it turns out
that this system always has a nontrivial solution. Not only will the system have a nontrivial
solution, but it also will have infinitely many solutions. It is also possible, but not required,
to have a nontrivial solution if n = m or n < m.

Consider the following example. 


Example 1.30: Solutions to a Homogeneous System of Equations 


Find the nontrivial solutions to the following homogeneous system of equations 

2x + y - z = 0
x + 2y - 2z = 0


Solution. Notice that this system has m = 2 equations and n = 3 variables, so n > m. 
Therefore by our previous discussion, we expect this system to have infinitely many solutions. 

The process we use to find the solutions for a homogeneous system of equations is the 
same process we used in the previous section. First, we construct the augmented matrix, 
given by

\[ \left[\begin{array}{rrr|r} 2 & 1 & -1 & 0 \\ 1 & 2 & -2 & 0 \end{array}\right] \]

Then, we carry this matrix to its reduced row-echelon form, given below.

\[ \left[\begin{array}{rrr|r} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \end{array}\right] \]

The corresponding system of equations is

\[ \begin{array}{rcl} x & = & 0 \\ y - z & = & 0 \end{array} \]

Since z is not restrained by any equation, we know that this variable will become our
parameter. Let z = t where t is any number. Therefore, our solution has the form

\[ \begin{array}{rcl} x & = & 0 \\ y & = & z = t \\ z & = & t \end{array} \]


Hence this system has infinitely many solutions, with one parameter t. 


□ 


Suppose we were to write the solution to the previous example in another form. Specifically,

\[ \begin{array}{rcl} x & = & 0 \\ y & = & 0 + t \\ z & = & 0 + t \end{array} \]

can be written as

\[ \left[\begin{array}{c} x \\ y \\ z \end{array}\right] = \left[\begin{array}{c} 0 \\ 0 \\ 0 \end{array}\right] + t \left[\begin{array}{c} 0 \\ 1 \\ 1 \end{array}\right] \]

Notice that we have constructed a column from the constants in the solution (all equal to
0), as well as a column corresponding to the coefficients on t in each equation. While we
will discuss this form of solution more in further chapters, for now consider the column of
coefficients of the parameter t. In this case, this is the column

\[ \left[\begin{array}{c} 0 \\ 1 \\ 1 \end{array}\right] \]

There is a special name for this column, which is basic solution. The basic solutions
of a system are columns constructed from the coefficients on parameters in the solution.
We often denote basic solutions by $X_1$, $X_2$ etc., depending on how many solutions occur.
Therefore, Example 1.30 has the basic solution

\[ X_1 = \left[\begin{array}{c} 0 \\ 1 \\ 1 \end{array}\right] \]

We explore this further in the following example. 


Example 1.31: Basic Solutions of a Homogeneous System 


Consider the following homogeneous system of equations. 

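x + 4y + 3z = 0
3x + 12y + 9z = 0

Find the basic solutions to this system.

Solution. The augmented matrix of this system and the resulting reduced row-echelon
form are

\[ \left[\begin{array}{rrr|r} 1 & 4 & 3 & 0 \\ 3 & 12 & 9 & 0 \end{array}\right] \rightarrow \left[\begin{array}{rrr|r} 1 & 4 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

When written in equations, this system is given by

\[ x + 4y + 3z = 0 \]

Notice that only x corresponds to a pivot column. In this case, we will have two parameters,
one for y and one for z. Let y = s and z = t for any numbers s and t. Then, our solution
becomes

\[ \begin{array}{rcl} x & = & -4s - 3t \\ y & = & s \\ z & = & t \end{array} \]

which can be written as

\[ \left[\begin{array}{c} x \\ y \\ z \end{array}\right] = \left[\begin{array}{c} 0 \\ 0 \\ 0 \end{array}\right] + s \left[\begin{array}{r} -4 \\ 1 \\ 0 \end{array}\right] + t \left[\begin{array}{r} -3 \\ 0 \\ 1 \end{array}\right] \]

You can see here that we have two columns of coefficients corresponding to parameters,
specifically one for s and one for t. Therefore, this system has two basic solutions! These
are

\[ X_1 = \left[\begin{array}{r} -4 \\ 1 \\ 0 \end{array}\right], \quad X_2 = \left[\begin{array}{r} -3 \\ 0 \\ 1 \end{array}\right] \]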

□ 
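Basic solutions can be produced directly from the reduced row-echelon form: set one free variable to 1, the other free variables to 0, and solve for the basic variables. The sketch below is one way to do this under stated assumptions; the names `rref` and `basic_solutions` are hypothetical, and the input is the coefficient matrix of a homogeneous system (the zero column is omitted).

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form, computed with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [a / lead for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                k = m[i][c]
                m[i] = [a - k * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def basic_solutions(coeffs):
    """One basic solution per free variable of the homogeneous system."""
    m = rref(coeffs)
    n = len(coeffs[0])
    # The i-th nonzero row has its leading 1 in column pivot_cols[i].
    pivot_cols = [next(j for j, a in enumerate(row) if a != 0) for row in m if any(row)]
    free_cols = [j for j in range(n) if j not in pivot_cols]
    result = []
    for f in free_cols:
        x = [Fraction(0)] * n
        x[f] = Fraction(1)                 # this free variable is 1, the others 0
        for i, p in enumerate(pivot_cols):
            x[p] = -m[i][f]                # solve row i for its basic variable
        result.append(x)
    return result

print(basic_solutions([[1, 4, 3], [3, 12, 9]]))
print(basic_solutions([[2, 1, -1], [1, 2, -2]]))
```

The number of columns returned equals the number of free variables, which matches the count of parameters in the solution.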


We now present a new definition. 



A remarkable result of this section is that a linear combination of the basic solutions is 
again a solution to the system. Even more remarkable is that every solution can be written 
as a linear combination of these solutions. Therefore, if we take a linear combination of the 
two solutions to Example 1.31, this would also be a solution. For example, we could take 
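the following linear combination

\[ 3 \left[\begin{array}{r} -4 \\ 1 \\ 0 \end{array}\right] + 2 \left[\begin{array}{r} -3 \\ 0 \\ 1 \end{array}\right] = \left[\begin{array}{r} -18 \\ 3 \\ 2 \end{array}\right] \]

You should take a moment to verify that

\[ \left[\begin{array}{c} x \\ y \\ z \end{array}\right] = \left[\begin{array}{r} -18 \\ 3 \\ 2 \end{array}\right] \]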

is in fact a solution to the system in Example 1.31. 
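The suggested verification can be carried out in a few lines. This sketch forms the combination 3X1 + 2X2 and substitutes it into both equations of Example 1.31; the helper names `combine` and `solves` are assumptions of this illustration.

```python
X1 = [-4, 1, 0]
X2 = [-3, 0, 1]

def combine(a, u, b, v):
    """The linear combination a*u + b*v of two columns."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

def solves(point):
    """Check both equations of the system in Example 1.31."""
    x, y, z = point
    return x + 4 * y + 3 * z == 0 and 3 * x + 12 * y + 9 * z == 0

V = combine(3, X1, 2, X2)
print(V, solves(V))
```

Replacing the coefficients 3 and 2 by any other numbers gives another solution, which is the point of the remark above.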

Another way in which we can find out more information about the solutions of a
homogeneous system is to consider the rank of the associated coefficient matrix. We now define
what is meant by the rank of a matrix.


Definition 1.33: Rank of a Matrix 


Let A be a matrix and consider any row-echelon form of A. Then, the number r of
leading entries in this form does not depend on the row-echelon form you choose, and is called
the rank of A. We denote it by rank(A).


Similarly, we could count the number of pivot positions (or pivot columns) to determine 
the rank of A. 



Solution. First, we need to find the reduced row-echelon form of A. Through the usual
algorithm, we find that this is

\[ \left[\begin{array}{rrr} \boxed{1} & 0 & -1 \\ 0 & \boxed{1} & 2 \\ 0 & 0 & 0 \end{array}\right] \]

Here we have two leading entries, or two pivot positions, shown above in boxes. The rank of
A is r = 2. □

Notice that we would have achieved the same answer if we had found the row-echelon 
form of A instead of the reduced row-echelon form. 

Suppose we have a homogeneous system of m equations in n variables, and suppose that
n > m. From our above discussion, we know that this system will have infinitely many
solutions. If we consider the rank of the coefficient matrix of this system, we can find out
even more about the solution. Note that we are looking at just the coefficient matrix, not
the entire augmented matrix.

Theorem 1.35: Rank and Solutions to a Homogeneous System


Let A be the m x n coefficient matrix corresponding to a homogeneous system of
equations, and suppose A has rank r. Then, the solution to the corresponding system
has n - r parameters.


Consider our above Example 1.31 in the context of this theorem. The system in this
example has m = 2 equations in n = 3 variables. First, because n > m, we know that the
system has a nontrivial solution, and therefore infinitely many solutions. This tells us that
the solution will contain at least one parameter. The rank of the coefficient matrix can tell
us even more about the solution! The rank of the coefficient matrix of the system is 1, as it
has one leading entry in row-echelon form. Theorem 1.35 tells us that the solution will have
n - r = 3 - 1 = 2 parameters. You can check that this is true in the solution to Example
1.31.

Notice that if n = m or n < m, it is possible to have either a unique solution (which will
be the trivial solution) or infinitely many solutions.
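The count n - r in Theorem 1.35 can be computed without writing out the solution. The following is a minimal sketch; the names `rank` and `parameters` are assumptions of this illustration, and the rank is found by counting pivots during forward elimination with exact fractions.

```python
from fractions import Fraction

def rank(rows):
    """Number of leading entries found during forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            k = m[i][c] / m[r][c]
            m[i] = [a - k * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def parameters(coeffs):
    """Number of parameters n - r in the solution of the homogeneous system."""
    return len(coeffs[0]) - rank(coeffs)

print(rank([[1, 4, 3], [3, 12, 9]]), parameters([[1, 4, 3], [3, 12, 9]]))
print(rank([[2, 1, -1], [1, 2, -2]]), parameters([[2, 1, -1], [1, 2, -2]]))
```

The first line corresponds to Example 1.31 and the second to Example 1.30.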

We are not limited to homogeneous systems of equations here. The rank of a matrix 
can be used to learn about the solutions of any system of linear equations. In the previous 
section, we discussed that a system of equations can have no solution, a unique solution, 
or infinitely many solutions. Suppose the system is consistent, whether it is homogeneous 
or not. The following theorem tells us how we can use the rank to learn about the type of 
solution we have. 


Theorem 1.36: Rank and Solutions to a Consistent System of Equations 


Let A be the m x (n + 1) augmented matrix corresponding to a consistent system of 
equations in n variables, and suppose A has rank r. Then 

1. the system has a unique solution if r = n 

2. the system has infinitely many solutions if r < n 


We will not present a formal proof of this, but consider the following discussions. 

1. No Solution The above theorem assumes that the system is consistent, that is, that
it has a solution. It turns out that it is possible for the augmented matrix of a system
with no solution to have any rank r as long as $r \geq 1$. Therefore, we must know that
the system is consistent in order to use this theorem!

2. Unique Solution Suppose r = n. Then, there is a pivot position in every column of 
the coefficient matrix of A. Hence, there is a unique solution. 

3. Infinitely Many Solutions Suppose r < n. Then there are infinitely many solutions.
There are fewer pivot positions (and hence fewer leading entries) than columns, meaning
that not every column is a pivot column. The columns which are not pivot columns
correspond to parameters. In fact, in this case we have n - r parameters.

1.2.5. Balancing Chemical Reactions


The tools of linear algebra can also be used in the subject area of Chemistry, specifically for 
balancing chemical reactions. 

Consider the chemical reaction 

\[ \mathrm{SnO_2} + \mathrm{H_2} \rightarrow \mathrm{Sn} + \mathrm{H_2O} \]

Here the elements involved are tin (Sn), oxygen (O), and hydrogen (H). A chemical reaction
occurs and the result is a combination of tin (Sn) and water ($\mathrm{H_2O}$). When considering
chemical reactions, we want to investigate how much of each element we began with and 
how much of each element is involved in the result. 

An important theory we will use here is the mass balance theory. It tells us that we 
cannot create or delete elements within a chemical reaction. For example, in the above 
expression, we must have the same number of oxygen, tin, and hydrogen on both sides of 
the reaction. Notice that this is not currently the case. For example, there are two oxygen 
atoms on the left and only one on the right. In order to fix this, we want to find numbers 
x,y,z,w such that 

\[ x\,\mathrm{SnO_2} + y\,\mathrm{H_2} \rightarrow z\,\mathrm{Sn} + w\,\mathrm{H_2O} \]

where both sides of the reaction have the same number of atoms of the various elements. 

This is a familiar problem. We can solve it by setting up a system of equations in the 
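variables x,y,z,w. Thus you need

\[ \begin{array}{ll} \mathrm{Sn}: & x = z \\ \mathrm{O}: & 2x = w \\ \mathrm{H}: & 2y = 2w \end{array} \]

We can rewrite these equations as

\[ \begin{array}{ll} \mathrm{Sn}: & x - z = 0 \\ \mathrm{O}: & 2x - w = 0 \\ \mathrm{H}: & 2y - 2w = 0 \end{array} \]

The augmented matrix for this system of equations is given by

\[ \left[\begin{array}{rrrr|r} 1 & 0 & -1 & 0 & 0 \\ 2 & 0 & 0 & -1 & 0 \\ 0 & 2 & 0 & -2 & 0 \end{array}\right] \]

The reduced row-echelon form of this matrix is

\[ \left[\begin{array}{rrrr|r} 1 & 0 & 0 & -\frac{1}{2} & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & -\frac{1}{2} & 0 \end{array}\right] \]

The solution is given by

\[ \begin{array}{rcl} x - \frac{1}{2}w & = & 0 \\ y - w & = & 0 \\ z - \frac{1}{2}w & = & 0 \end{array} \]

which we can write as

\[ \begin{array}{rcl} x & = & \frac{1}{2}t \\ y & = & t \\ z & = & \frac{1}{2}t \\ w & = & t \end{array} \]

For example, let w = 2 and this would yield x = 1, y = 2, and z = 1. We can put these
values back into the expression for the reaction which yields

\[ \mathrm{SnO_2} + 2\mathrm{H_2} \rightarrow \mathrm{Sn} + 2\mathrm{H_2O} \]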

Observe that each side of the expression contains the same number of atoms of each element. 
This means that it preserves the total number of atoms, as required, and so the chemical 
reaction is balanced. 
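The final check, that each side contains the same number of atoms of each element, can be sketched in code. In this illustration each molecule is written by hand as a dictionary of atom counts; the function name `atoms` and this representation are assumptions, not part of the method described above.

```python
from collections import Counter

def atoms(side):
    """Total atom counts for a list of (coefficient, molecule) pairs."""
    total = Counter()
    for coeff, molecule in side:
        for element, count in molecule.items():
            total[element] += coeff * count
    return total

# Atom counts per molecule, entered by hand.
SnO2 = {"Sn": 1, "O": 2}
H2 = {"H": 2}
Sn = {"Sn": 1}
H2O = {"H": 2, "O": 1}

left = atoms([(1, SnO2), (2, H2)])
right = atoms([(1, Sn), (2, H2O)])
print(dict(left), dict(right), left == right)
```

With all four coefficients equal to 1 the two counters differ in oxygen, which is the imbalance noted at the start of this section.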

Consider another example. 


Example 1.37: Balancing a Chemical Reaction 


Potassium is denoted by K, oxygen by O, phosphorus by P and hydrogen by H. 
Consider the reaction given by 

\[ \mathrm{KOH} + \mathrm{H_3PO_4} \rightarrow \mathrm{K_3PO_4} + \mathrm{H_2O} \]

Balance this chemical reaction. 


Solution. We will use the same procedure as above to solve this problem. We need to find 
values for x, y, z, w such that 

\[ x\,\mathrm{KOH} + y\,\mathrm{H_3PO_4} \rightarrow z\,\mathrm{K_3PO_4} + w\,\mathrm{H_2O} \]


preserves the total number of atoms of each element. 

Finding these values can be done by finding the solution to the following system of 
equations. 

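\[ \begin{array}{ll} \mathrm{K}: & x = 3z \\ \mathrm{O}: & x + 4y = 4z + w \\ \mathrm{H}: & x + 3y = 2w \\ \mathrm{P}: & y = z \end{array} \]

The augmented matrix for this system is

\[ \left[\begin{array}{rrrr|r} 1 & 0 & -3 & 0 & 0 \\ 1 & 4 & -4 & -1 & 0 \\ 1 & 3 & 0 & -2 & 0 \\ 0 & 1 & -1 & 0 & 0 \end{array}\right] \]

and the reduced row-echelon form is

\[ \left[\begin{array}{rrrr|r} 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & -\frac{1}{3} & 0 \\ 0 & 0 & 1 & -\frac{1}{3} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right] \]

The solution is given by

\[ \begin{array}{rcl} x - w & = & 0 \\ y - \frac{1}{3}w & = & 0 \\ z - \frac{1}{3}w & = & 0 \end{array} \]

which can be written as

\[ \begin{array}{rcl} x & = & t \\ y & = & \frac{1}{3}t \\ z & = & \frac{1}{3}t \\ w & = & t \end{array} \]

Choose a value for t, say 3. Then w = 3 and this yields x = 3, y = 1, z = 1. It follows
that the balanced reaction is given by

\[ 3\,\mathrm{KOH} + 1\,\mathrm{H_3PO_4} \rightarrow 1\,\mathrm{K_3PO_4} + 3\,\mathrm{H_2O} \]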

Note that this results in the same number of atoms on both sides. □ 

Of course these numbers you are finding would typically be the number of moles of the 
molecules on each side. Thus three moles of KOH added to one mole of $\mathrm{H_3PO_4}$ yields one
mole of $\mathrm{K_3PO_4}$ and three moles of $\mathrm{H_2O}$.
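The whole balancing procedure can be sketched as follows. This is an illustration under stated assumptions, not a general-purpose tool: the names `rref` and `balance` are hypothetical, the input is the coefficient matrix of the homogeneous system (one row per element, one column per molecule), and the sketch assumes exactly one free variable, located in the last column. The parameter is then scaled so that all coefficients are whole numbers.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def rref(rows):
    """Reduced row-echelon form, computed with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [a / lead for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                k = m[i][c]
                m[i] = [a - k * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def balance(coeffs):
    """Smallest whole-number solution, assuming the last variable is the only free one."""
    m = rref(coeffs)
    n = len(coeffs[0])
    pivots = [next(j for j, a in enumerate(row) if a != 0) for row in m if any(row)]
    if pivots != list(range(n - 1)):
        raise ValueError("expected exactly one free variable, in the last column")
    x = [Fraction(0)] * n
    x[-1] = Fraction(1)                      # set the parameter t to 1
    for i, p in enumerate(pivots):
        x[p] = -m[i][-1]                     # basic variable in terms of t
    # Multiply by the least common multiple of the denominators.
    scale = reduce(lambda a, b: a * b // gcd(a, b), (v.denominator for v in x), 1)
    return [int(v * scale) for v in x]

# Rows K, O, H, P; columns x, y, z, w (Example 1.37).
print(balance([[1, 0, -3, 0], [1, 4, -4, -1], [1, 3, 0, -2], [0, 1, -1, 0]]))
# Rows Sn, O, H; columns x, y, z, w (the tin reaction).
print(balance([[1, 0, -1, 0], [2, 0, 0, -1], [0, 2, 0, -2]]))
```

Scaling by the least common multiple of the denominators plays the role of choosing t = 3 in Example 1.37 and w = 2 in the tin reaction.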

1.2.6. Dimensionless Variables 


This section shows how solving systems of equations can be used to determine appropriate 
dimensionless variables. It is only an introduction to this topic and considers a specific 
example of a simple airplane wing shown below. We assume for simplicity that it is a flat 
plane at an angle to the wind which is blowing against it with speed V as shown. 



The angle $\theta$ is called the angle of incidence, $B$ is the span of the wing and $A$ is called
the chord. Denote by $l$ the lift. Then this should depend on various quantities like $\theta$, $V$, $B$, $A$
and so forth. Here is a table which indicates various quantities on which it is reasonable to
expect $l$ to depend.


Variable              Symbol      Units
chord                 $A$         $\text{m}$
span                  $B$         $\text{m}$
angle of incidence    $\theta$    $\text{m}^0\,\text{kg}^0\,\text{sec}^0$
speed of wind         $V$         $\text{m}\,\text{sec}^{-1}$
speed of sound        $V_0$       $\text{m}\,\text{sec}^{-1}$
density of air        $\rho$      $\text{kg}\,\text{m}^{-3}$
viscosity             $\mu$       $\text{kg}\,\text{sec}^{-1}\,\text{m}^{-1}$
lift                  $l$         $\text{kg}\,\text{sec}^{-2}\,\text{m}$

Here m denotes meters, sec refers to seconds and kg refers to kilograms. All of these are
likely familiar except for $\mu$, which we will discuss in further detail now.

Viscosity is a measure of how much internal friction is experienced when the fluid moves. 
It is roughly a measure of how “sticky” the fluid is. Consider a piece of area parallel to the 
direction of motion of the fluid. To say that the viscosity is large is to say that the tangential 
force applied to this area must be large in order to achieve a given change in speed of the 
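fluid in a direction normal to the tangential force. Thus

\[ \mu \, (\text{area}) \, (\text{velocity gradient}) = \text{tangential force} \]

Hence

\[ (\text{units on } \mu) \; \text{m}^2 \left( \frac{\text{m}}{\text{sec}\,\text{m}} \right) = \text{kg}\,\text{sec}^{-2}\,\text{m} \]

Thus the units on $\mu$ are

\[ \text{kg}\,\text{sec}^{-1}\,\text{m}^{-1} \]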

as claimed above. 

Returning to our original discussion, you may think that we would want 

\[ l = f(A, B, \theta, V, V_0, \rho, \mu) \]


This is very cumbersome because it depends on seven variables. Also, it is likely that without
much care, a change in the units such as going from meters to feet would result in an incorrect
value for $l$. The way to get around this problem is to look for $l$ as a function of dimensionless
variables multiplied by something which has units of force. It is helpful because first of all,
you will likely have fewer independent variables and secondly, you could expect the formula
to hold independent of the way of specifying length, mass and so forth. One looks for

\[ l = f(g_1, \cdots, g_k) \, \rho V^2 AB \]

where the units on $\rho V^2 AB$ are

\[ \frac{\text{kg}}{\text{m}^3} \left( \frac{\text{m}}{\text{sec}} \right)^2 \text{m}^2 = \frac{\text{kg} \times \text{m}}{\text{sec}^2} \]

which are the units of force. Each of these $g_i$ is of the form

\[ A^{x_1} B^{x_2} \theta^{x_3} V^{x_4} V_0^{x_5} \rho^{x_6} \mu^{x_7} \tag{1.11} \]

and each $g_i$ is independent of the dimensions. That is, this expression must not depend on
meters, kilograms, seconds, etc. Thus, placing in the units for each of these quantities, one
needs

\[ \text{m}^{x_1} \text{m}^{x_2} \left( \text{m}^{x_4} \text{sec}^{-x_4} \right) \left( \text{m}^{x_5} \text{sec}^{-x_5} \right) \left( \text{kg}\,\text{m}^{-3} \right)^{x_6} \left( \text{kg}\,\text{sec}^{-1}\,\text{m}^{-1} \right)^{x_7} = \text{m}^0\,\text{kg}^0\,\text{sec}^0 \]

Notice that there are no units on $\theta$ because it is just the radian measure of an angle. Hence
its dimensions consist of length divided by length, thus it is dimensionless. Then this leads
to the following equations for the $x_i$.

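\[
\begin{array}{ll}
\text{m}: & x_1 + x_2 + x_4 + x_5 - 3x_6 - x_7 = 0 \\
\text{sec}: & -x_4 - x_5 - x_7 = 0 \\
\text{kg}: & x_6 + x_7 = 0
\end{array}
\]

The augmented matrix for this system is

\[
\left[ \begin{array}{rrrrrrr|r}
1 & 1 & 0 & 1 & 1 & -3 & -1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{array} \right]
\]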


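The reduced row-echelon form is given by

\[
\left[ \begin{array}{rrrrrrr|r}
1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{array} \right]
\]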


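and so the solutions are of the form

\[
\begin{array}{l}
x_1 = -x_2 - x_7 \\
x_3 = x_3 \\
x_4 = -x_5 - x_7 \\
x_6 = -x_7
\end{array}
\]

Thus, in terms of vectors, the solution is

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
=
\begin{bmatrix} -x_2 - x_7 \\ x_2 \\ x_3 \\ -x_5 - x_7 \\ x_5 \\ -x_7 \\ x_7 \end{bmatrix}
\]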




Thus the free variables are $x_2, x_3, x_5, x_7$. By assigning values to these, we can obtain
dimensionless variables by placing the values obtained for the $x_i$ in the formula 1.11. For example,
let $x_2 = 1$ and all the rest of the free variables are 0. This yields

\[ x_1 = -1, \; x_2 = 1, \; x_3 = 0, \; x_4 = 0, \; x_5 = 0, \; x_6 = 0, \; x_7 = 0 \]

The dimensionless variable is then $A^{-1}B^{1}$. This is the ratio between the span and the chord.
It is called the aspect ratio, denoted as $AR$. Next let $x_3 = 1$ and all others equal zero. This
gives for a dimensionless quantity the angle $\theta$. Next let $x_5 = 1$ and all others equal zero.
This gives

\[ x_1 = 0, \; x_2 = 0, \; x_3 = 0, \; x_4 = -1, \; x_5 = 1, \; x_6 = 0, \; x_7 = 0 \]

Then the dimensionless variable is $V^{-1}V_0$. However, it is written as $V/V_0$. This is called the
Mach number $M$. Finally, let $x_7 = 1$ and all the other free variables equal 0. Then

\[ x_1 = -1, \; x_2 = 0, \; x_3 = 0, \; x_4 = -1, \; x_5 = 0, \; x_6 = -1, \; x_7 = 1 \]

The dimensionless variable which results from this is $A^{-1}V^{-1}\rho^{-1}\mu$. It is customary to
write it as $Re = (AV\rho)/\mu$. This one is called the Reynolds number. It is the one which
involves viscosity. Thus we would look for

\[ l = f(Re, AR, \theta, M) \; \text{kg} \times \text{m} / \text{sec}^2 \]

This is quite interesting because it is easy to vary $Re$ by simply adjusting the velocity or
$A$ but it is hard to vary things like $\mu$ or $\rho$. Note that all of the dimensionless quantities are easy to adjust.
Now this could be used, along with wind tunnel experiments to get a formula for the lift 
which would be reasonable. You could also consider more variables and more complicated 
situations in the same way. 
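
As a quick check of the computation above, one can add up the unit exponents of each proposed combination. The sketch below is in Python; the dictionary `UNITS` and the function name `dimensions` are illustrative names, and the exponents are copied from the table of units given earlier in this section.

```python
# Units of each variable as exponents of (m, sec, kg), taken from the table of units.
UNITS = {
    "A":     (1,  0, 0),
    "B":     (1,  0, 0),
    "theta": (0,  0, 0),
    "V":     (1, -1, 0),
    "V0":    (1, -1, 0),
    "rho":   (-3, 0, 1),
    "mu":    (-1, -1, 1),
}

def dimensions(exponents):
    """Net (m, sec, kg) exponents of a product of variables raised to the given powers."""
    total = [0, 0, 0]
    for name, power in exponents.items():
        for k, unit_power in enumerate(UNITS[name]):
            total[k] += power * unit_power
    return tuple(total)

# The four combinations obtained from the free variables x2, x3, x5, x7.
groups = {
    "A^-1 B (aspect ratio)":       {"A": -1, "B": 1},
    "theta":                       {"theta": 1},
    "V^-1 V0":                     {"V": -1, "V0": 1},
    "A^-1 V^-1 rho^-1 mu (1/Re)":  {"A": -1, "V": -1, "rho": -1, "mu": 1},
}
for label, exps in groups.items():
    print(label, dimensions(exps))   # each should come out dimensionless

# rho V^2 A B should carry the units of force, kg m sec^-2.
force_like = dimensions({"rho": 1, "V": 2, "A": 1, "B": 1})
print(force_like)
```

Each combination in `groups` corresponds to setting one free variable to 1 and the others to 0, so a net exponent of zero in every unit confirms that the solution of the system was read off correctly.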

1.2.7. Exercises 

1. Find the point $(x_1, y_1)$ which lies on both lines, $x + 3y = 1$ and $4x - y = 3$.

2. Find the point of intersection of the two lines $3x + y = 3$ and $x + 2y = 1$.

3. Do the three lines, $x + 2y = 1$, $2x - y = 1$, and $4x + 3y = 3$ have a common point of
intersection? If so, find the point and if not, tell why they don’t have such a common
point of intersection.

4. Do the three planes, $x + y - 3z = 2$, $2x + y + z = 1$, and $3x + 2y - 2z = 0$ have
a common point of intersection? If so, find one and if not, tell why there is no such
point.

5. Four times the weight of Gaston is 150 pounds more than the weight of Ichabod. 
Four times the weight of Ichabod is 660 pounds less than seventeen times the weight 
of Gaston. Four times the weight of Gaston plus the weight of Siegfried equals 290 
pounds. Brunhilde would balance all three of the others. Find the weights of the four 
people. 

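6. Consider the following augmented matrix in which $*$ denotes an arbitrary number
and $\blacksquare$ denotes a nonzero number. Determine whether the given augmented matrix is
consistent. If consistent, is the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & * & * & * & * & * \\
0 & \blacksquare & * & * & 0 & * \\
0 & 0 & \blacksquare & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & *
\end{array} \right]
\]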

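7. Consider the following augmented matrix in which $*$ denotes an arbitrary number
and $\blacksquare$ denotes a nonzero number. Determine whether the given augmented matrix is
consistent. If consistent, is the solution unique?

\[
\left[ \begin{array}{ccc|c}
\blacksquare & * & * & * \\
0 & \blacksquare & * & * \\
0 & 0 & \blacksquare & *
\end{array} \right]
\]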

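8. Consider the following augmented matrix in which $*$ denotes an arbitrary number
and $\blacksquare$ denotes a nonzero number. Determine whether the given augmented matrix is
consistent. If consistent, is the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & * & * & * & * & * \\
0 & \blacksquare & 0 & * & 0 & * \\
0 & 0 & 0 & \blacksquare & * & * \\
0 & 0 & 0 & 0 & \blacksquare & *
\end{array} \right]
\]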

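9. Consider the following augmented matrix in which $*$ denotes an arbitrary number
and $\blacksquare$ denotes a nonzero number. Determine whether the given augmented matrix is
consistent. If consistent, is the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & * & * & * & * & * \\
0 & \blacksquare & * & * & 0 & * \\
0 & 0 & 0 & 0 & \blacksquare & 0 \\
0 & 0 & 0 & 0 & * & \blacksquare
\end{array} \right]
\]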

10. Suppose a system of equations has fewer equations than variables. Will such a system 
necessarily be consistent? If so, explain why and if not, give an example which is not 
consistent. 

11. If a system of equations has more equations than variables, can it have a solution? If 
so, give an example and if not, tell why not. 


12. Find $h$ such that

\[ \left[ \begin{array}{rr|r} 2 & h & 4 \\ 3 & 6 & 7 \end{array} \right] \]

is the augmented matrix of an inconsistent system.


13. Find $h$ such that

\[ \left[ \begin{array}{rr|r} 1 & h & 3 \\ 2 & 4 & 6 \end{array} \right] \]

is the augmented matrix of a consistent system.


14. Find $h$ such that

\[ \left[ \begin{array}{rr|r} 1 & 1 & 4 \\ 3 & h & 12 \end{array} \right] \]

is the augmented matrix of a consistent system.


15. Choose $h$ and $k$ such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[ \left[ \begin{array}{rr|r} 1 & h & 2 \\ 2 & 4 & k \end{array} \right] \]

16. Choose $h$ and $k$ such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[ \left[ \begin{array}{rr|r} 1 & 2 & 2 \\ 2 & h & k \end{array} \right] \]

17. Determine if the system is consistent. If so, is the solution unique?

\[
\begin{array}{l}
x + 2y + z - w = 2 \\
x - y + z + w = 1 \\
2x + y - z = 1 \\
4x + 2y + z = 5
\end{array}
\]

18. Determine if the system is consistent. If so, is the solution unique?

\[
\begin{array}{l}
x + 2y + z - w = 2 \\
x - y + z + w = 0 \\
2x + y - z = 1 \\
4x + 2y + z = 3
\end{array}
\]

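19. Determine which matrices are in reduced row-echelon form.

(a) \[ \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 7 \end{bmatrix} \]

(b) \[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

(c) \[ \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 5 \\ 0 & 0 & 1 & 2 & 0 & 4 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix} \]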

20. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 2 & -1 & 3 & -1 \\ 1 & 0 & 2 & 1 \\ 1 & -1 & 1 & -2 \end{bmatrix} \]

21. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 0 & 0 & -1 & -1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & -1 \end{bmatrix} \]

22. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 3 & -6 & -7 & -8 \\ 1 & -2 & -2 & -2 \\ 1 & -2 & -3 & -4 \end{bmatrix} \]

23. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 2 & 4 & 5 & 15 \\ 1 & 2 & 3 & 9 \\ 1 & 2 & 2 & 6 \end{bmatrix} \]

24. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 4 & -1 & 7 & 10 \\ 1 & 0 & 3 & 3 \\ 1 & -1 & -2 & 1 \end{bmatrix} \]

25. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} 3 & 5 & -4 & 2 \\ 1 & 2 & -1 & 1 \\ 1 & 1 & -2 & 0 \end{bmatrix} \]

26. Row reduce the following matrix to obtain the row-echelon form. Then continue to
obtain the reduced row-echelon form.

\[ \begin{bmatrix} -2 & 3 & -8 & 7 \\ 1 & -2 & 5 & -5 \\ 1 & -3 & 7 & -8 \end{bmatrix} \]

27. Find the solution of the system whose augmented matrix is

\[ \left[ \begin{array}{rrr|r} 1 & 2 & 0 & 2 \\ 1 & 3 & 4 & 2 \\ 1 & 0 & 2 & 1 \end{array} \right] \]

28. Find the solution of the system whose augmented matrix is

\[ \left[ \begin{array}{rrr|r} 1 & 2 & 0 & 2 \\ 2 & 0 & 1 & 1 \\ 3 & 2 & 1 & 3 \end{array} \right] \]

29. Find the solution of the system whose augmented matrix is

\[ \left[ \begin{array}{rrr|r} 1 & 1 & 0 & 1 \\ 1 & 0 & 4 & 2 \end{array} \right] \]

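30. Find the solution of the system whose augmented matrix is

\[ \left[ \begin{array}{rrrrr|r} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 1 & 2 & 0 & 0 & 1 & 3 \\ 1 & 0 & 1 & 0 & 2 & 2 \end{array} \right] \]

31. Find the solution of the system whose augmented matrix is

\[ \left[ \begin{array}{rrrrr|r} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 0 & 2 & 0 & 0 & 1 & 3 \\ 1 & -1 & 2 & 2 & 2 & 0 \end{array} \right] \]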

32. Find the solution to the system of equations, $7x + 14y + 15z = 22$, $2x + 4y + 3z = 5$,
and $3x + 6y + 10z = 13$.

33. Find the solution to the system of equations, $3x - y + 4z = 6$, $y + 8z = 0$, and
$-2x + y = -4$.

34. Find the solution to the system of equations, $9x - 2y + 4z = -17$, $13x - 3y + 6z = -25$,
and $-2x - z = 3$.

35. Find the solution to the system of equations, $65x + 84y + 16z = 546$, $81x + 105y + 20z =
682$, and $84x + 110y + 21z = 713$.

36. Find the solution to the system of equations, $8x + 2y + 3z = -3$, $8x + 3y + 3z = -1$,
and $4x + y + 3z = -9$.

37. Find the solution to the system of equations, $-8x + 2y + 5z = 18$, $-8x + 3y + 5z = 13$,
and $-4x + y + 5z = 19$.

38. Find the solution to the system of equations, $3x - y - 2z = 3$, $y - 4z = 0$, and
$-2x + y = -2$.

39. Find the solution to the system of equations, $-9x + 15y = 66$, $-11x + 18y = 79$,
$-x + y = 4$, and $z = 3$.

40. Find the solution to the system of equations, $-19x + 8y = -108$, $-71x + 30y = -404$,
$-2x + y = -12$, $4x + z = 14$.

41. Suppose a system of equations has fewer equations than variables and you have found 
a solution to this system of equations. Is it possible that your solution is the only one? 
Explain. 

42. Suppose a system of linear equations has a 2 x 4 augmented matrix and the last column 
is a pivot column. Could the system of linear equations be consistent? Explain. 

43. Suppose the coefficient matrix of a system of n equations with n variables has the 
property that every column is a pivot column. Does it follow that the system of 
equations must have a solution? If so, must the solution be unique? Explain. 

44. Suppose there is a unique solution to a system of linear equations. What must be true 
of the pivot columns in the augmented matrix? 

45. The steady state temperature, $u$, of a plate solves Laplace’s equation, $\Delta u = 0$. One way
to approximate the solution is to divide the plate into a square mesh and require the
temperature at each node to equal the average of the temperature at the four adjacent
nodes. In the following picture, the numbers represent the observed temperature at
the indicated nodes. Find the temperature at the interior nodes, indicated by $x, y, z,$
and $w$. One of the equations is $z = \frac{1}{4}(10 + 0 + w + x)$.

[Figure: square mesh showing the observed boundary temperatures and the interior nodes x, y, z, w]

46. Consider the following diagram of four circuits.

[Figure: diagram of four circuits; recoverable labels include a 3 Ω resistor and a 20 volt source]

The jagged lines denote resistors and the numbers next to them give their resistance
in ohms, written as $\Omega$. The breaks in the lines having one short line and one long
line denote a voltage source which causes the current to flow in the direction which
goes from the longer of the two lines toward the shorter along the unbroken part of
the circuit. The current in amps in the four circuits is denoted by $I_1, I_2, I_3, I_4$ and
it is understood that the motion is in the counter clockwise direction. If $I_k$ ends up
being negative, then it just means the current flows in the clockwise direction. Then
Kirchhoff’s law states:

The sum of the resistance times the amps in the counter clockwise direction around a
loop equals the sum of the voltage sources in the same direction around the loop.

In the above diagram, the top left circuit should give the equation

\[ 2I_2 - 2I_1 + 5I_2 - 5I_3 + 3I_2 = 5 \]

For the circuit on the lower left, you should have

\[ 4I_1 + I_1 - I_4 + 2I_1 - 2I_2 = -10 \]

Write equations for each of the other two circuits and then give a solution to the
resulting system of equations.

47. Consider the following diagram of three circuits.

[Figure: diagram of three circuits; recoverable labels include 3 Ω and 4 Ω resistors]

The jagged lines denote resistors and the numbers next to them give their resistance
in ohms, written as $\Omega$. The breaks in the lines having one short line and one long
line denote a voltage source which causes the current to flow in the direction which
goes from the longer of the two lines toward the shorter along the unbroken part of
the circuit. The current in amps in the three circuits is denoted by $I_1, I_2, I_3$ and it
is understood that the motion is in the counter clockwise direction. If $I_k$ ends up
being negative, then it just means the current flows in the clockwise direction. Then
Kirchhoff’s law states:

The sum of the resistance times the amps in the counter clockwise direction around a
loop equals the sum of the voltage sources in the same direction around the loop.

Find $I_1, I_2, I_3$.

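48. Find the rank of the following matrix.

\[ \begin{bmatrix} 4 & -16 & -1 & -5 \\ 1 & -4 & 0 & -1 \\ 1 & -4 & -1 & -2 \end{bmatrix} \]

49. Find the rank of the following matrix.

\[ \begin{bmatrix} 3 & 6 & 5 & 12 \\ 1 & 2 & 2 & 5 \\ 1 & 2 & 1 & 2 \end{bmatrix} \]

50. Find the rank of the following matrix.

\[ \begin{bmatrix} 0 & 0 & -1 & 0 & 3 \\ 1 & 4 & 1 & 0 & -8 \\ 1 & 4 & 0 & 1 & 2 \\ -1 & -4 & 0 & -1 & -2 \end{bmatrix} \]

51. Find the rank of the following matrix.

\[ \begin{bmatrix} 4 & -4 & 3 & -9 \\ 1 & -1 & 1 & -2 \\ 1 & -1 & 0 & -3 \end{bmatrix} \]

52. Find the rank of the following matrix.

\[ \begin{bmatrix} 2 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 7 \\ 1 & 0 & 0 & 1 & 7 \end{bmatrix} \]

53. Find the rank of the following matrix.

\[ \begin{bmatrix} 4 & 15 & 29 \\ 1 & 4 & 8 \\ 1 & 3 & 5 \\ 3 & 9 & 15 \end{bmatrix} \]

54. Find the rank of the following matrix.

\[ \begin{bmatrix} 0 & 0 & -1 & 0 & 1 \\ 1 & 2 & 3 & -2 & -18 \\ 1 & 2 & 2 & -1 & -11 \\ -1 & -2 & -2 & 1 & 11 \end{bmatrix} \]

55. Find the rank of the following matrix.

\[ \begin{bmatrix} 1 & -2 & 0 & 3 & 11 \\ 1 & -2 & 0 & 4 & 15 \\ 1 & -2 & 0 & 3 & 11 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]

56. Find the rank of the following matrix.

\[ \begin{bmatrix} -2 & -3 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \\ -3 & 0 & -3 \end{bmatrix} \]

57. Find the rank of the following matrix.

\[ \begin{bmatrix} 4 & 4 & 20 & -1 & 17 \\ 1 & 1 & 5 & 0 & 5 \\ 1 & 1 & 5 & -1 & 2 \\ 3 & 3 & 15 & -3 & 6 \end{bmatrix} \]

58. Find the rank of the following matrix.

\[ \begin{bmatrix} -1 & 3 & 4 & -3 & 8 \\ 1 & -3 & -4 & 2 & -5 \\ 1 & -3 & -4 & 1 & -2 \\ -2 & 6 & 8 & -2 & 4 \end{bmatrix} \]

59. Suppose $A$ is an $m \times n$ matrix. Explain why the rank of $A$ is always no larger than
$\min(m, n)$.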

60. State whether each of the following sets of data are possible for the matrix equation
$AX = B$. If possible, describe the solution set. That is, tell whether there exists
a unique solution, no solution or infinitely many solutions. Here, $[A|B]$ denotes the
augmented matrix.

(a) $A$ is a $5 \times 6$ matrix, $\text{rank}(A) = 4$ and $\text{rank}[A|B] = 4$.

(b) $A$ is a $3 \times 4$ matrix, $\text{rank}(A) = 3$ and $\text{rank}[A|B] = 2$.

(c) $A$ is a $4 \times 2$ matrix, $\text{rank}(A) = 4$ and $\text{rank}[A|B] = 4$.

(d) $A$ is a $5 \times 5$ matrix, $\text{rank}(A) = 4$ and $\text{rank}[A|B] = 5$.

(e) $A$ is a $4 \times 2$ matrix, $\text{rank}(A) = 2$ and $\text{rank}[A|B] = 2$.

61. Consider the system $-5x + 2y - z = 0$ and $-5x - 2y - z = 0$. Both equations equal
zero and so $-5x + 2y - z = -5x - 2y - z$ which is equivalent to $y = 0$. Does it follow
that $x$ and $z$ can equal anything? Notice that when $x = 1$, $z = -4$, and $y = 0$ are
plugged in to the equations, the equations do not equal 0. Why?

62. Balance the following chemical reactions.

(a) $\mathrm{KNO_3} + \mathrm{H_2CO_3} \rightarrow \mathrm{K_2CO_3} + \mathrm{HNO_3}$

(b) $\mathrm{AgI} + \mathrm{Na_2S} \rightarrow \mathrm{Ag_2S} + \mathrm{NaI}$

(c) $\mathrm{Ba_3N_2} + \mathrm{H_2O} \rightarrow \mathrm{Ba(OH)_2} + \mathrm{NH_3}$

(d) $\mathrm{CaCl_2} + \mathrm{Na_3PO_4} \rightarrow \mathrm{Ca_3(PO_4)_2} + \mathrm{NaCl}$

63. In the section on dimensionless variables it was observed that $\rho V^2 AB$ has the units of
force. Describe a systematic way to obtain such combinations of the variables which
will yield something which has the units of force.

2. Matrices


2.1 Matrix Arithmetic 


Outcomes 


A. Perform the matrix operations of matrix addition, scalar multiplication, transpo- 
sition and matrix multiplication. Identify when these operations are not defined. 
Represent these operations in terms of the entries of a matrix. 

B. Prove algebraic properties for matrix addition, scalar multiplication, transposi- 
tion, and matrix multiplication. Apply these properties to manipulate an alge- 
braic expression involving matrices. 

C. Compute the inverse of a matrix using row operations, and prove identities in- 
volving matrix inverses. 

E. Solve a linear system using matrix algebra. 

F. Use multiplication by an elementary matrix to apply row operations. 

G. Write a matrix as a product of elementary matrices. 


You have now solved systems of equations by writing them in terms of an augmented 
matrix and then doing row operations on this augmented matrix. It turns out that matrices 
are important not only for systems of equations but also in many applications. 

Recall that a matrix is a rectangular array of numbers. Several of them are referred to
as matrices. For example, here is a matrix.

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{bmatrix} \tag{2.1} \]

Recall that the size or dimension of a matrix is defined as $m \times n$ where $m$ is the number of
rows and $n$ is the number of columns. The above matrix is a $3 \times 4$ matrix because there are
three rows and four columns. You can remember the columns are like columns in a Greek
temple. They stand upright while the rows lie flat like rows made by a tractor in a plowed
field.


When specifying the size of a matrix, you always list the number of rows before the 
number of columns. You might remember that you always list the rows before the columns 
by using the phrase Rowman Catholic. 

Consider the following definition. 


Definition 2.1: Square Matrix 


A matrix A which has size n x n is called a square matrix . In other words, A is a 
square matrix if it has the same number of rows and columns. 


There is some notation specific to matrices which we now introduce. We denote the
columns of a matrix $A$ by $A_j$ as follows

\[ A = \begin{bmatrix} A_1 & A_2 & \cdots & A_n \end{bmatrix} \]

Therefore, $A_j$ is the $j^{th}$ column of $A$, when counted from left to right.

The individual elements of the matrix are called entries or components of $A$. Elements
of the matrix are identified according to their position. The $(i, j)$-entry of a matrix is the
entry in the $i^{th}$ row and $j^{th}$ column. For example, in the matrix 2.1 above, 8 is in position
$(2, 3)$ (and is called the $(2, 3)$-entry) because it is in the second row and the third column.

In order to remember which matrix we are speaking of, we will denote the entry in the
$i^{th}$ row and the $j^{th}$ column of matrix $A$ by $a_{ij}$. Then, we can write $A$ in terms of its entries,
as $A = [a_{ij}]$. Using this notation on the matrix in 2.1, $a_{23} = 8$, $a_{32} = -9$, $a_{12} = 2$, etc.

There are various operations which are done on matrices of appropriate sizes. Matrices 
can be added to and subtracted from other matrices, multiplied by a scalar, and multiplied 
by other matrices. We will never divide a matrix by another matrix, but we will see later 
how matrix inverses play a similar role. 

In doing arithmetic with matrices, we often define the action by what happens in terms 
of the entries (or components) of the matrices. Before looking at these operations in depth, 
consider a few general definitions. 



One possible zero matrix is shown in the following example. 



Note there is a $2 \times 3$ zero matrix, a $3 \times 4$ zero matrix, etc. In fact there is a zero matrix
for every size!

Definition 2.4: Equality of Matrices


Let $A$ and $B$ be two $m \times n$ matrices. Then $A = B$ means that for $A = [a_{ij}]$ and
$B = [b_{ij}]$, $a_{ij} = b_{ij}$ for all $1 \le i \le m$ and $1 \le j \le n$.


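In other words, two matrices are equal exactly when they are the same size and the
corresponding entries are identical. Thus

\[ \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \neq \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

because they are different sizes. Also,

\[ \begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix} \neq \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix} \]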


because, although they are the same size, their corresponding entries are not identical. 
In the following section, we explore addition of matrices. 


2.1.1. Addition of Matrices 


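When adding matrices, all matrices in the sum need to have the same size. For example,

\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 2 \end{bmatrix} \]

and

\[ \begin{bmatrix} -1 & 4 & 8 \\ 2 & 8 & 5 \end{bmatrix} \]

cannot be added, as one has size $3 \times 2$ while the other has size $2 \times 3$.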

However, the addition

\[ \begin{bmatrix} 4 & 6 & 3 \\ 5 & 0 & 4 \\ 11 & -2 & 3 \end{bmatrix} + \begin{bmatrix} 0 & 5 & 0 \\ 4 & -4 & 14 \\ 1 & 2 & 6 \end{bmatrix} \]

is possible.

The formal definition is as follows.

This definition tells us that when adding matrices, we simply add corresponding entries
of the matrices. This is demonstrated in the next example.


Example 2.6: Addition of Matrices of Same Size

Add the following matrices, if possible.

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 2 & 3 \\ -6 & 2 & 1 \end{bmatrix} \]

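Solution. Notice that both $A$ and $B$ are of size $2 \times 3$. Since $A$ and $B$ are of the same size,
the addition is possible. Using Definition 2.5, the addition is done as follows.

\[
A + B = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 2 & 3 \\ -6 & 2 & 1 \end{bmatrix}
= \begin{bmatrix} 1 + 5 & 2 + 2 & 3 + 3 \\ 1 + (-6) & 0 + 2 & 4 + 1 \end{bmatrix}
= \begin{bmatrix} 6 & 4 & 6 \\ -5 & 2 & 5 \end{bmatrix}
\]

□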


Addition of matrices obeys very much the same properties as normal addition with num- 
bers. Note that when we write for example A + B then we assume that both matrices are 
of equal size so that the operation is indeed possible. 
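
The entrywise rule can be written out in a few lines of code. The following is a small sketch in Python, storing a matrix as a list of rows; the function name `add_matrices` is an illustrative choice and not notation used in this text.

```python
def add_matrices(a, b):
    """Entrywise sum of two matrices given as lists of rows; the sizes must agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size to be added")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# The matrices A and B from Example 2.6.
A = [[1, 2, 3], [1, 0, 4]]
B = [[5, 2, 3], [-6, 2, 1]]
print(add_matrices(A, B))   # [[6, 4, 6], [-5, 2, 5]]
```

Calling `add_matrices` on a $3 \times 2$ matrix and a $2 \times 3$ matrix raises an error, which mirrors the fact that such a sum is not defined.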


Proposition 2.7: Properties of Matrix Addition

Let $A$, $B$ and $C$ be matrices. Then, the following properties hold.

• Commutative Law of Addition

\[ A + B = B + A \tag{2.2} \]

• Associative Law of Addition

\[ (A + B) + C = A + (B + C) \tag{2.3} \]

• Existence of an Additive Identity

There exists a zero matrix $0$ such that

\[ A + 0 = A \tag{2.4} \]

• Existence of an Additive Inverse

There exists a matrix $-A$ such that

\[ A + (-A) = 0 \tag{2.5} \]


Proof. Consider the Commutative Law of Addition given in 2.2. Let $A$, $B$, $C$, and $D$ be
matrices such that $A + B = C$ and $B + A = D$. We want to show that $D = C$. To do so, we
will use the definition of matrix addition given in Definition 2.5. Now,

\[ c_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = d_{ij} \]

Therefore, $C = D$ because the $ij^{th}$ entries are the same for all $i$ and $j$. Note that the
conclusion follows from the commutative law of addition of numbers, which says that if $a$
and $b$ are two numbers, then $a + b = b + a$. The proofs of the other results are similar, and
are left as an exercise. □

We call the zero matrix in 2.4 the additive identity. Similarly, we call the matrix $-A$
in 2.5 the additive inverse. $-A$ is defined to equal $(-1)A = [-a_{ij}]$. In other words, every
entry of $A$ is multiplied by $-1$. In the next section we will study scalar multiplication in
more depth to understand what is meant by $(-1)A$.


2.1.2. Scalar Multiplication of Matrices 


Recall that we use the word scalar when referring to numbers. Therefore, scalar multiplica- 
tion of a matrix is the multiplication of a matrix by a number. To illustrate this concept, 
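consider the following example in which a matrix is multiplied by the scalar 3.

\[
3 \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{bmatrix}
= \begin{bmatrix} 3 & 6 & 9 & 12 \\ 15 & 6 & 24 & 21 \\ 18 & -27 & 3 & 6 \end{bmatrix}
\]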


The new matrix is obtained by multiplying every entry of the original matrix by the given 
scalar. 

The formal definition of scalar multiplication is as follows. 



Consider the following example. 


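Example 2.9: Effect of Multiplication by a Scalar

Find the result of multiplying the following matrix $A$ by 7.

\[ A = \begin{bmatrix} 2 & 0 \\ 1 & -4 \end{bmatrix} \]


Solution. By Definition 2.8, we multiply each element of $A$ by 7. Therefore,

\[
7A = 7 \begin{bmatrix} 2 & 0 \\ 1 & -4 \end{bmatrix}
= \begin{bmatrix} 7(2) & 7(0) \\ 7(1) & 7(-4) \end{bmatrix}
= \begin{bmatrix} 14 & 0 \\ 7 & -28 \end{bmatrix}
\]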


Similarly to addition of matrices, there are several properties of scalar multiplication
which hold.

Proposition 2.10: Properties of Scalar Multiplication


Let $A$, $B$ be matrices, and $k$, $p$ be scalars. Then, the following properties hold.

• Distributive Law over Matrix Addition

\[ k(A + B) = kA + kB \]

• Distributive Law over Scalar Addition

\[ (k + p)A = kA + pA \]

• Associative Law for Scalar Multiplication

\[ k(pA) = (kp)A \]

• Rule for Multiplication by 1

\[ 1A = A \]


The proof of this proposition is similar to the proof of Proposition 2.7 and is left as an
exercise to the reader.
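
A numerical spot check is not a proof, but it can help build intuition for these properties. The sketch below, in Python, tests each of the four properties on one example; the function names `scale` and `add` and the second matrix `B` are arbitrary illustrative choices, not taken from the text.

```python
def scale(k, a):
    """Multiply every entry of the matrix a (a list of rows) by the scalar k."""
    return [[k * x for x in row] for row in a]

def add(a, b):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[2, 0], [1, -4]]     # the matrix from Example 2.9
B = [[1, 5], [-3, 2]]     # an arbitrary second matrix, used only for the check
k, p = 7, 3

print(scale(k, A))                                            # 7A, as in Example 2.9
print(scale(k, add(A, B)) == add(scale(k, A), scale(k, B)))   # k(A + B) = kA + kB
print(scale(k + p, A) == add(scale(k, A), scale(p, A)))       # (k + p)A = kA + pA
print(scale(k, scale(p, A)) == scale(k * p, A))               # k(pA) = (kp)A
print(scale(1, A) == A)                                       # 1A = A
```

Each comparison holds because the corresponding law for numbers holds entry by entry, which is exactly the idea behind the proof.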


2.1.3. Multiplication of Matrices 


The next important matrix operation we will explore is multiplication of matrices. The 
operation of matrix multiplication is one of the most important and useful of the matrix 
operations. Throughout this section, we will also demonstrate how matrix multiplication 
relates to linear systems of equations. 

First, we provide a formal definition of row and column vectors.

We may simply use the term vector throughout this text to refer to either a column or
row vector. If we do so, the context will make it clear which we are referring to.

In this chapter, we will again use the notion of linear combination of vectors as in
Definition 1.32. In this context, a linear combination is a sum consisting of vectors multiplied
by scalars. For example,

\[ \begin{bmatrix} 50 \\ 122 \end{bmatrix} = 7 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix} + 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix} \]

is a linear combination of three vectors.

It turns out that we can express any system of linear equations as a linear combination 
of vectors. In fact, the vectors that we will use are just the columns of the corresponding 
augmented matrix! 


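Definition 2.12: The Vector Form of a System of Linear Equations


Suppose we have a system of equations given by

\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]

We can express this system in vector form which is as follows:

\[
x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]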


Notice that each vector used here is one column from the corresponding augmented 
matrix. There is one vector for each variable in the system, along with the constant vector. 

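The first important form of matrix multiplication is multiplying a matrix by a vector.
Consider the product given by

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix} \]

We will soon see that this equals

\[ 7 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix} + 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix} = \begin{bmatrix} 50 \\ 122 \end{bmatrix} \]

In general terms,

\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= x_1 \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}
+ x_3 \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix}
= \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{bmatrix}
\]

Thus you take $x_1$ times the first column, add to $x_2$ times the second column, and finally $x_3$
times the third column. The above sum is a linear combination of the columns of the matrix.
When you multiply a matrix on the left by a vector on the right, the numbers making up the
vector are just the scalars to be used in the linear combination of the columns as illustrated
above.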
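
The column-by-column description translates directly into code. Here is a minimal sketch in Python, with the matrix stored as a list of rows; the function name `mat_vec` is an illustrative choice and not notation from this text.

```python
def mat_vec(a, x):
    """Compute AX as x_1 * (column 1 of A) + ... + x_n * (column n of A)."""
    if any(len(row) != len(x) for row in a):
        raise ValueError("the number of columns of A must equal the length of X")
    result = [0] * len(a)
    for j, xj in enumerate(x):           # one column of A at a time
        for i in range(len(a)):
            result[i] += xj * a[i][j]    # add x_j times the j-th column
    return result

# The product considered above: a 2 x 3 matrix times a 3 x 1 vector.
print(mat_vec([[1, 2, 3], [4, 5, 6]], [7, 8, 9]))   # [50, 122]
```

The outer loop runs over columns rather than rows on purpose, so that each pass adds one scalar multiple of a column, just as in the linear combination.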

Here is the formal definition of how to multiply an m x n matrix by an n x 1 column 
vector. 



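If we write the columns of $A$ in terms of their entries, they are of the form

\[ A_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix} \]

Then, we can write the product $AX$ as

\[
AX = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]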

Note that multiplication of an m x n matrix and an n x 1 vector produces an m x 1 
vector. 

Here is an example.

Example 2.14: A Vector Multiplied by a Matrix


Compute the product $AX$ for

\[ A = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 2 & 1 & -2 \\ 2 & 1 & 4 & 1 \end{bmatrix}, \quad X = \begin{bmatrix} 1 \\ 2 \\ 0 \\ 1 \end{bmatrix} \]

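Solution. We will use Definition 2.13 to compute the product. Therefore, we compute the
product $AX$ as follows.

\[
\begin{aligned}
& 1 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + 2 \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + 0 \begin{bmatrix} 1 \\ 1 \\ 4 \end{bmatrix} + 1 \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + \begin{bmatrix} 4 \\ 4 \\ 2 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} \\
&= \begin{bmatrix} 8 \\ 2 \\ 5 \end{bmatrix}
\end{aligned}
\]

□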


Using the above operation, we can also write a system of linear equations in matrix 
form. In this form, we express the system as a matrix multiplied by a vector. Consider the 
following definition. 


Definition 2.15: The Matrix Form of a System of Linear Equations


Suppose we have a system of equations given by

\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n = b_2 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]

Then we can express this system in matrix form as follows.

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]


This is also known as the form $AX = B$. The matrix $A$ is simply the coefficient
matrix of the system. The vector $X$ is the column vector constructed from the variables of
the system, and the vector $B$ is the column vector constructed from the constants of the
system. It is important to note that any system of linear equations can be written in this
form.

Notice that if we write a homogeneous system of equations in matrix form, it would have
the form $AX = 0$, for the zero vector $0$.

You can see from this definition that a vector

\[ X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \]

will satisfy the equation $AX = B$ only when the entries $x_1, x_2, \cdots, x_n$ of the vector $X$ are
solutions to the original system.

Now that we have examined how to multiply a matrix by a vector, we wish to consider 
the case where we multiply two matrices of more general sizes, although these sizes still need 
to be appropriate as we will see. For example, in Example 2.14, we multiplied a 3 x 4 matrix 
by a 4 x 1 vector. We want to investigate how to multiply other sizes of matrices. 

We have not yet given any conditions on when matrix multiplication is possible! For
matrices $A$ and $B$, in order to form the product $AB$, the number of columns of $A$ must equal
the number of rows of $B$. Consider a product $AB$ where $A$ has size $m \times n$ and $B$ has size
$n \times p$. Then, the product in terms of size of matrices is given by

\[ (m \times \underbrace{n)\,(n}_{\text{these must match!}} \times p) = m \times p \]

Note the two outside numbers give the size of the product. One of the most important
rules regarding matrix multiplication is the following. If the two middle numbers don’t
match, you can’t multiply the matrices!

When the number of columns of A equals the number of rows of B the two matrices are 
said to be conformable and the product AB is obtained as follows. 



Consider the following example. 


Example 2.17: Multiplying Two Matrices 


Find AB if possible. 


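\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}
\]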


Solution. The first thing you need to verify when calculating a product is whether the 
multiplication is possible. The first matrix has size 2x3 and the second matrix has size 
3x3. The inside numbers are equal, so A and B are conformable matrices. According to 
the above discussion AB will be a 2 x 3 matrix. Definition 2.16 gives us a way to calculate 
each column of AB, as follows. 


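\[
\left[
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}}^{\text{First column}},\;
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}}^{\text{Second column}},\;
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}}^{\text{Third column}}
\right]
\]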

You know how to multiply a matrix times a vector, using Definition 2.13 for each of the 
three columns. Thus 


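\[
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} -1 & 9 & 3 \\ -2 & 7 & 3 \end{bmatrix}
\]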


□ 
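The column-by-column computation can be sketched in Python as follows. Matrices are stored as lists of rows, and the helper names `mat_vec` and `mat_mul_by_columns` are assumptions introduced only for this illustration.

```python
def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]


def mat_mul_by_columns(A, B):
    """Form AB one column at a time: column j of AB is A times column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("inside dimensions do not match")
    columns = [mat_vec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    # columns[j][i] is the (i, j)-entry, so transpose back into a list of rows
    return [list(row) for row in zip(*columns)]


A = [[1, 2, 1], [0, 2, 1]]
B = [[1, 2, 0], [0, 3, 1], [-2, 1, 1]]
AB = mat_mul_by_columns(A, B)
print(AB)  # [[-1, 9, 3], [-2, 7, 3]]
```

The same function handles a column vector times a row vector, since these are simply matrices with one column or one row.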


Since vectors are simply n x 1 or 1 x m matrices, we can also multiply a vector by another
vector.



Example 2.18: Vector Times Vector Multiplication 

Multiply if possible 

\[
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 1 & 0 \end{bmatrix}.
\]


Solution. In this case we are multiplying a matrix of size 3 x 1 by a matrix of size 1x4. The 
inside numbers match so the product is defined. Note that the product will be a matrix of 
size 3x4. Using Definition 2.16, we can compute this product as follows 


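\[
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 1 & 0 \end{bmatrix}
=
\left[
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 1 \end{bmatrix}}^{\text{First column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 2 \end{bmatrix}}^{\text{Second column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 1 \end{bmatrix}}^{\text{Third column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 0 \end{bmatrix}}^{\text{Fourth column}}
\right]
\]

You can use Definition 2.13 to verify that this product is
\[
\begin{bmatrix} 1 & 2 & 1 & 0 \\ 2 & 4 & 2 & 0 \\ 1 & 2 & 1 & 0 \end{bmatrix}
\]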


□ 


Example 2.19: A Multiplication Which is Not Defined 


Find BA if possible. 



\[
B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\]




Solution. First check if it is possible. This product is of the form (3 x 3) (2 x 3). The inside
numbers do not match and so you can't do this multiplication. □

In this case, we say that the multiplication is not defined. Notice that these are the same
matrices which we used in Example 2.17. In this example, we tried to calculate BA instead
of AB. This demonstrates another property of matrix multiplication. While the product AB
may be defined, we cannot assume that the product BA will be possible. Therefore, it is
important to always check that the product is defined before carrying out any calculations.

Earlier, we defined the zero matrix 0 to be the matrix (of appropriate size) containing 
zeros in all entries. Consider the following example for multiplication by the zero matrix. 



Solution. In this product, we compute
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
Hence, A0 = 0.

Notice that we could also multiply A by the 2 x 1 zero vector given by
$\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. The result would be the 2 x 1 zero vector. Therefore,
it is always the case that A0 = 0, for an appropriately sized zero matrix or vector.

□


2.1.4. The $ij^{th}$ Entry of a Product


In previous sections, we used the entries of a matrix to describe the action of matrix addition 
and scalar multiplication. We can also study matrix multiplication using the entries of 
matrices. 

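What is the $ij^{th}$ entry of AB? It is the entry in the $i^{th}$ row and the $j^{th}$ column of the
product AB.

Now if A is m x n and B is n x p, then we know that the product AB has the form
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1j} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2j} & \cdots & b_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nj} & \cdots & b_{np}
\end{bmatrix}
\]

The $j^{th}$ column of AB is of the form
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix}
\]
which is an m x 1 column vector.

It is calculated by
\[
b_{1j} \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ b_{2j} \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ b_{nj} \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]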


Therefore, the $ij^{th}$ entry is the entry in row i of this vector. This is computed by
\[
a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}
\]

The following is the formal definition for the $ij^{th}$ entry of a product of matrices.

In other words, to find the (i, j)-entry of the product AB, or $(AB)_{ij}$, you multiply the
$i^{th}$ row of A, on the left, by the $j^{th}$ column of B. To express AB in terms of its entries, we
write $AB = [(AB)_{ij}]$.
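As a minimal sketch, the sum $\sum_k a_{ik}b_{kj}$ translates directly into Python. The function name `entry` and the two small matrices are assumptions chosen for illustration; rows and columns are counted from 1 to match the text.

```python
def entry(A, B, i, j):
    """Return the (i, j)-entry of AB, with i and j counted from 1."""
    n = len(B)                      # number of rows of B
    if len(A[0]) != n:              # columns of A must equal rows of B
        raise ValueError("product is not defined")
    return sum(A[i - 1][k] * B[k][j - 1] for k in range(n))


A = [[1, 2], [3, 1], [2, 6]]        # 3 x 2
B = [[2, 3, 1], [7, 6, 2]]          # 2 x 3
print(entry(A, B, 3, 2))            # 2*3 + 6*6 = 42
```

Only row i of A and column j of B are touched, so a single entry can be found without computing the whole product.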

Consider the following example. 


Example 2.22: The Entries of a Product 


Compute AB if possible. If it is, find the (3, 2)-entry of AB using Definition 2.21.
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \end{bmatrix}
\]




Solution. First check if the product is possible. It is of the form (3 x 2) (2 x 3) and since the 
inside numbers match, it is possible to do the multiplication. The result should be a 3 x 3 
matrix. We can first compute AB: 


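\[
\left[
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}\begin{bmatrix} 2 \\ 7 \end{bmatrix},\;
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}\begin{bmatrix} 3 \\ 6 \end{bmatrix},\;
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix}
\right]
\]
where the commas separate the columns in the resulting product. Thus the above product
equals
\[
\begin{bmatrix} 16 & 15 & 5 \\ 13 & 15 & 5 \\ 46 & 42 & 14 \end{bmatrix}
\]
which is a 3 x 3 matrix as desired. Thus, the (3, 2)-entry equals 42.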




Now using Definition 2.21, we can find that the (3, 2)-entry equals 


\[
\sum_{k=1}^{2} a_{3k}b_{k2} = a_{31}b_{12} + a_{32}b_{22} = 2 \times 3 + 6 \times 6 = 42
\]


Consulting our result for AB above, this is correct! 

You may wish to use this method to verify that the rest of the entries in AB are correct. 

□ 


Here is another example. 


Example 2.23: Finding the Entries of a Product 


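Determine if the product AB is defined. If it is, find the (2, 1)-entry of the product.
\[
A = \begin{bmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}
\]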

Solution. This product is of the form (3 x 3) (3 x 2). The middle numbers match so the 
matrices are conformable and it is possible to compute the product. 

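We want to find the (2, 1)-entry of AB, that is, the entry in the second row and first
column of the product. We will use Definition 2.21, which states
\[
(AB)_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}
\]

In this case, n = 3, i = 2 and j = 1. Hence the (2, 1)-entry is found by computing
\[
(AB)_{21} = \sum_{k=1}^{3} a_{2k}b_{k1}
= \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}
\]

Substituting in the appropriate values, this product becomes
\[
\begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}
= \begin{bmatrix} 7 & 6 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}
= 1 \times 7 + 3 \times 6 + 2 \times 2 = 29
\]

Hence, $(AB)_{21} = 29$.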

You should take a moment to find a few other entries of AB. You can multiply the 
matrices to check that your answers are correct. The product AB is given by 


\[
AB = \begin{bmatrix} 13 & 13 \\ 29 & 32 \\ 0 & 0 \end{bmatrix}
\]

□


2.1.5. Properties of Matrix Multiplication


As pointed out above, it is sometimes possible to multiply matrices in one order but not in 
the other order. However, even if both AB and BA are defined, they may not be equal. 


Example 2.24: Matrix Multiplication is Not Commutative 

Compare the products AB and BA, for matrices
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]



Solution. First, notice that A and B are both of size 2x2. Therefore, both products AB 
and BA are defined. The first product, AB is 


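\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix}
\]

The second product, BA is
\[
BA = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
= \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}
\]

Therefore, $AB \neq BA$.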


□ 


This example illustrates that you cannot assume AB = BA even when multiplication is 
defined in both orders. If for some matrices A and B it is true that AB = BA, then we say 
that A and B commute. This is one important property of matrix multiplication. 
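A quick way to test whether two particular matrices commute is to compute both products and compare them. The sketch below uses plain Python lists of rows; the helper names `mat_mul` and `commute` are assumptions made for this illustration.

```python
def mat_mul(A, B):
    """Product AB of two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def commute(A, B):
    """Return True when AB and BA are equal (both must be defined)."""
    return mat_mul(A, B) == mat_mul(B, A)


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]

print(mat_mul(A, B))   # [[2, 1], [4, 3]]
print(mat_mul(B, A))   # [[3, 4], [1, 2]]
print(commute(A, B))   # False
print(commute(A, I))   # True
```

Multiplying by B on the right switches the columns of A, while multiplying by B on the left switches its rows, which is why the two products differ here.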

The following are other important properties of matrix multiplication. Notice that these
properties hold only when the sizes of the matrices are such that the products are defined.



Proof. First we will prove 2.6. We will use Definition 2.21 and prove this statement using
the $ij^{th}$ entries of a matrix. Therefore,
\[
\begin{aligned}
(A(rB + sC))_{ij} &= \sum_k a_{ik}(rB + sC)_{kj} = \sum_k a_{ik}(rb_{kj} + sc_{kj}) \\
&= r\sum_k a_{ik}b_{kj} + s\sum_k a_{ik}c_{kj} = r(AB)_{ij} + s(AC)_{ij} \\
&= (r(AB) + s(AC))_{ij}
\end{aligned}
\]

Thus A(rB + sC) = r(AB) + s(AC) as claimed.

The proof of 2.7 follows the same pattern and is left as an exercise.

Statement 2.8 is the associative law of multiplication. Using Definition 2.21,
\[
\begin{aligned}
(A(BC))_{ij} &= \sum_k a_{ik}(BC)_{kj} = \sum_k a_{ik}\sum_l b_{kl}c_{lj} \\
&= \sum_l (AB)_{il}c_{lj} = ((AB)C)_{ij}.
\end{aligned}
\]

This proves 2.8. □


2.1.6. The Transpose 


Another important operation on matrices is that of taking the transpose. For a matrix A,
we denote the transpose of A by $A^T$. Before formally defining the transpose, we explore
this operation on the following matrix.
\[
\begin{bmatrix} 1 & 4 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}^T
= \begin{bmatrix} 1 & 3 & 2 \\ 4 & 1 & 6 \end{bmatrix}
\]

What happened? The first column became the first row and the second column became
the second row. Thus the 3 x 2 matrix became a 2 x 3 matrix. The number 4 was in the
first row and the second column and it ended up in the second row and first column.

The definition of the transpose is as follows. 



The (i, j)-entry of A becomes the (j, i)-entry of $A^T$.
Consider the following example. 


Example 2.27: The Transpose of a Matrix

Calculate $A^T$ for the following matrix
\[
A = \begin{bmatrix} 1 & 2 & -6 \\ 3 & 5 & 4 \end{bmatrix}
\]

Solution. By Definition 2.26, we know that for $A = [a_{ij}]$, $A^T = [a_{ji}]$. In other words, we
switch the row and column location of each entry. The (1, 2)-entry becomes the (2, 1)-entry.
Thus,
\[
A^T = \begin{bmatrix} 1 & 3 \\ 2 & 5 \\ -6 & 4 \end{bmatrix}
\]

Notice that A is a 2 x 3 matrix, while $A^T$ is a 3 x 2 matrix.


□ 


The transpose of a matrix has the following important properties.



Proof. First we prove 2. From Definition 2.26, 

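\[
\begin{aligned}
(AB)^T = [(AB)_{ij}]^T = [(AB)_{ji}] &= \Big[\sum_k a_{jk}b_{ki}\Big] = \Big[\sum_k b_{ki}a_{jk}\Big] \\
&= \Big[\sum_k (B^T)_{ik}(A^T)_{kj}\Big] = [b_{ij}]^T[a_{ij}]^T = B^TA^T
\end{aligned}
\]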

The proof of Formula 3 is left as an exercise. □ 
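The identity $(AB)^T = B^TA^T$ can be spot-checked numerically. The following is a small sketch using lists of rows; `transpose` and `mat_mul` are helper names assumed for this illustration, and the matrices are arbitrary conformable examples.

```python
def transpose(A):
    """Switch rows and columns: the (i, j)-entry becomes the (j, i)-entry."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Product AB of two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2, 1], [0, 2, 1]]                 # 2 x 3
B = [[1, 2, 0], [0, 3, 1], [-2, 1, 1]]     # 3 x 3

left = transpose(mat_mul(A, B))            # (AB)^T
right = mat_mul(transpose(B), transpose(A))  # B^T A^T
print(left)           # [[-1, -2], [9, 7], [3, 3]]
print(left == right)  # True
```

Note the reversed order on the right: $A^TB^T$ is not even defined for these sizes, since it would be a (3 x 2)(3 x 3) product.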

The transpose of a matrix is related to other important topics. Consider the following 
definition. 



We will explore these definitions in the following examples.

Solution. By Definition 2.29, we need to show that $A = A^T$. Now, using Definition 2.26,
\[
A^T = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 5 & -3 \\ 3 & -3 & 7 \end{bmatrix}
\]

Hence, $A = A^T$, so A is symmetric.


□ 



Solution. By Definition 2.29,
\[
A^T = \begin{bmatrix} 0 & -1 & -3 \\ 1 & 0 & -2 \\ 3 & 2 & 0 \end{bmatrix}
\]

You can see that each entry of $A^T$ is equal to -1 times the same entry of A. Hence,
$A^T = -A$ and so by Definition 2.29, A is skew symmetric.

□
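Both conditions, $A = A^T$ and $A^T = -A$, are easy to test mechanically. The sketch below assumes matrices stored as lists of rows; the function names and the two sample matrices (read off from the transposes shown in the examples) are assumptions of this illustration.

```python
def transpose(A):
    """Switch rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def is_symmetric(A):
    """A is symmetric when A equals its transpose."""
    return transpose(A) == A


def is_skew_symmetric(A):
    """A is skew symmetric when its transpose equals -A."""
    return transpose(A) == [[-x for x in row] for row in A]


S = [[2, 1, 3], [1, 5, -3], [3, -3, 7]]
K = [[0, 1, 3], [-1, 0, 2], [-3, -2, 0]]

print(is_symmetric(S), is_skew_symmetric(S))   # True False
print(is_symmetric(K), is_skew_symmetric(K))   # False True
```

A skew symmetric matrix must have zeros on its main diagonal, because each diagonal entry has to equal its own negative.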


2.1.7. The Identity and Inverses 


There is a special matrix, denoted I, which is referred to as the identity matrix. The identity
matrix is always a square matrix, and it has the property that there are ones down the main
diagonal and zeroes elsewhere. Here are some identity matrices of various sizes.
\[
\begin{bmatrix} 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

The first is the 1 x 1 identity matrix, the second is the 2 x 2 identity matrix, and so on. By
extension, you can likely see what the n x n identity matrix would be. When it is necessary
to distinguish which size of identity matrix is being discussed, we will use the notation $I_n$
for the n x n identity matrix.

The identity matrix is so important that there is a special symbol to denote the $ij^{th}$
entry of the identity matrix. This symbol is given by $I_{ij} = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker
symbol defined by
\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]

$I_n$ is called the identity matrix because it is a multiplicative identity in the following
sense.



Proof. The (i, j)-entry of $AI_n$ is given by:
\[
\sum_k a_{ik}\delta_{kj} = a_{ij}
\]
and so $AI_n = A$. The other case is left as an exercise for you. □

We now define the matrix operation which in some ways plays the role of division. 



Such a matrix $A^{-1}$ will have the same size as the matrix A. It is very important to
observe that the inverse of a matrix, if it exists, is unique. Another way to think of this is
that if it acts like the inverse, then it is the inverse.



Proof. In this proof, it is assumed that I is the n x n identity matrix. Let A, B be n x n
matrices such that $A^{-1}$ exists and AB = BA = I. We want to show that $A^{-1} = B$. Now
using properties we have seen, we get:
\[
A^{-1} = A^{-1}I = A^{-1}(AB) = (A^{-1}A)B = IB = B
\]
Hence, $A^{-1} = B$ which tells us that the inverse is unique. □

The next example demonstrates how to check the inverse of a matrix. 


Example 2.35: Verifying the Inverse of a Matrix 

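Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$. Show
$\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}$ is the inverse of A.

Solution. To check this, multiply
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I
\]
and
\[
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I
\]
showing that this matrix is indeed the inverse of A. □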


Unlike ordinary multiplication of numbers, it can happen that $A \neq 0$ but A may fail to
have an inverse. This is illustrated in the following example.


Example 2.36: A Nonzero Matrix With No Inverse 

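Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. Show that A does not have an inverse.


Solution. One might think A would have an inverse because it does not equal zero. However,
note that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

If $A^{-1}$ existed, we would have the following
\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
= A^{-1}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}\right)
= A^{-1}\left(A\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right)
= (A^{-1}A)\begin{bmatrix} -1 \\ 1 \end{bmatrix}
= I\begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]

This says that
\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]
which is impossible! Therefore, A does not have an inverse.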


□ 


In the next section, we will explore how to find the inverse of a matrix, if it exists.


2.1.8. Finding the Inverse of a Matrix


In Example 2.35, we were given $A^{-1}$ and asked to verify that this matrix was in fact the
inverse of A. In this section, we explore how to find $A^{-1}$.

Let
\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\]
as in Example 2.35. In order to find $A^{-1}$, we need to find a matrix
$\begin{bmatrix} x & z \\ y & w \end{bmatrix}$ such that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} x & z \\ y & w \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]


We can multiply these two matrices, and see that in order for this equation to be true, we 
must find the solution to the systems of equations, 

\[
\begin{aligned} x + y &= 1 \\ x + 2y &= 0 \end{aligned}
\]
and
\[
\begin{aligned} z + w &= 0 \\ z + 2w &= 1 \end{aligned}
\]


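Writing the augmented matrix for these two systems gives
\[
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 1 & 2 & 0 \end{array}\right]
\]
for the first system and
\[
\left[\begin{array}{cc|c} 1 & 1 & 0 \\ 1 & 2 & 1 \end{array}\right] \tag{2.9}
\]
for the second.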

Let's solve the first system. Take -1 times the first row and add to the second to get
\[
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 1 & -1 \end{array}\right]
\]

Now take -1 times the second row and add to the first to get
\[
\left[\begin{array}{cc|c} 1 & 0 & 2 \\ 0 & 1 & -1 \end{array}\right]
\]

Writing in terms of variables, this says x = 2 and y = -1.

Now solve the second system, 2.9, to find z and w. You will find that z = -1 and w = 1.
If we take the values found for x, y, z, and w and put them into our inverse matrix, we
see that the inverse is
\[
A^{-1} = \begin{bmatrix} x & z \\ y & w \end{bmatrix}
= \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]

After taking the time to solve the second system, you may have noticed that exactly
the same row operations were used to solve both systems. In each case, the end result was
something of the form $[I|X]$ where I is the identity and X gave a column of the inverse. In
the above, $\begin{bmatrix} x \\ y \end{bmatrix}$, the first column of the inverse, was obtained
by solving the first system and then the second column $\begin{bmatrix} z \\ w \end{bmatrix}$.

To simplify this procedure, we could have solved both systems at once! To do so, we
could have written
\[
\left[\begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array}\right]
\]
and row reduced until we obtained
\[
\left[\begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array}\right]
\]
and read off the inverse as the 2 x 2 matrix on the right side.
This exploration motivates the following important algorithm. 



This algorithm shows how to find the inverse if it exists. It will also tell you if A does 
not have an inverse. 
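The procedure of row reducing $[A|I]$ can be sketched in Python as below. This is one possible implementation under stated assumptions, not the text's algorithm verbatim: it uses exact `Fraction` arithmetic, always scales each pivot to 1 before clearing its column, and the function name `inverse` is hypothetical.

```python
from fractions import Fraction


def inverse(A):
    """Row reduce [A | I]; return the inverse of A, or None if A is not invertible."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                                  # left side cannot become I
        M[col], M[pivot] = M[pivot], M[col]              # switch rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]                 # scale the pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add a multiple of the pivot row to clear the rest of the column
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]                        # right half is the inverse


print(inverse([[1, 1], [1, 2]]) == [[2, -1], [-1, 1]])   # True
print(inverse([[1, 1], [1, 1]]))                          # None
```

Returning `None` corresponds to the case where the left side cannot be carried to the identity, so no inverse exists.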

Consider the following example. 


Example 2.38: Finding the Inverse

Let $A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 3 & 1 & -1 \end{bmatrix}$. Find $A^{-1}$ if it exists.

Solution. Set up the augmented matrix
\[
[A|I] = \left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
1 & 0 & 2 & 0 & 1 & 0 \\
3 & 1 & -1 & 0 & 0 & 1
\end{array}\right]
\]

Now we row reduce, with the goal of obtaining the 3 x 3 identity matrix on the left hand
side. First, take -1 times the first row and add to the second followed by -3 times the first
row added to the third row. This yields
\[
\left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
0 & -2 & 0 & -1 & 1 & 0 \\
0 & -5 & -7 & -3 & 0 & 1
\end{array}\right]
\]


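Then take 5 times the second row and add to -2 times the third row.
\[
\left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
0 & -10 & 0 & -5 & 5 & 0 \\
0 & 0 & 14 & 1 & 5 & -2
\end{array}\right]
\]

Next take the third row and add to -7 times the first row. This yields
\[
\left[\begin{array}{ccc|ccc}
-7 & -14 & 0 & -6 & 5 & -2 \\
0 & -10 & 0 & -5 & 5 & 0 \\
0 & 0 & 14 & 1 & 5 & -2
\end{array}\right]
\]

Now take $-\frac{7}{5}$ times the second row and add to the first row.
\[
\left[\begin{array}{ccc|ccc}
-7 & 0 & 0 & 1 & -2 & -2 \\
0 & -10 & 0 & -5 & 5 & 0 \\
0 & 0 & 14 & 1 & 5 & -2
\end{array}\right]
\]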


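Finally divide the first row by -7, the second row by -10 and the third row by 14 which yields
\[
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\
0 & 1 & 0 & \frac{1}{2} & -\frac{1}{2} & 0 \\
0 & 0 & 1 & \frac{1}{14} & \frac{5}{14} & -\frac{1}{7}
\end{array}\right]
\]

Notice that the left hand side of this matrix is now the 3 x 3 identity matrix $I_3$. Therefore,
the inverse is the 3 x 3 matrix on the right hand side, given by
\[
\begin{bmatrix}
-\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\
\frac{1}{2} & -\frac{1}{2} & 0 \\
\frac{1}{14} & \frac{5}{14} & -\frac{1}{7}
\end{bmatrix}
\]

□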


It may happen that through this algorithm, you discover that the left hand side cannot 
be row reduced to the identity matrix. Consider the following example of this situation. 


Example 2.39: A Matrix Which Has No Inverse

Let $A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 4 \end{bmatrix}$. Find $A^{-1}$ if it exists.


Solution. Write the augmented matrix $[A|I]$
\[
\left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
1 & 0 & 2 & 0 & 1 & 0 \\
2 & 2 & 4 & 0 & 0 & 1
\end{array}\right]
\]
and proceed to do row operations attempting to obtain $[I|A^{-1}]$. Take -1 times the first row
and add to the second. Then take -2 times the first row and add to the third row.
\[
\left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
0 & -2 & 0 & -1 & 1 & 0 \\
0 & -2 & 0 & -2 & 0 & 1
\end{array}\right]
\]

Next add -1 times the second row to the third row.
\[
\left[\begin{array}{ccc|ccc}
1 & 2 & 2 & 1 & 0 & 0 \\
0 & -2 & 0 & -1 & 1 & 0 \\
0 & 0 & 0 & -1 & -1 & 1
\end{array}\right]
\]

At this point, you can see there will be no way to obtain I on the left side of this augmented
matrix. Hence, there is no way to complete this algorithm, and therefore the inverse of A
does not exist. In this case, we say that A is not invertible. □

If the algorithm provides an inverse for the original matrix, it is always possible to check
your answer. To do so, use the method demonstrated in Example 2.35. Check that the
products $AA^{-1}$ and $A^{-1}A$ both equal the identity matrix. Through this method, you can
always be sure that you have calculated $A^{-1}$ properly!

One way in which the inverse of a matrix is useful is to find the solution of a system
of linear equations. Recall from Definition 2.15 that we can write a system of equations in
matrix form, which is of the form AX = B. Suppose you find the inverse $A^{-1}$ of the matrix
A. Then you could multiply both sides of this equation on the left by $A^{-1}$ and simplify
to obtain
\[
\begin{aligned}
(A^{-1})AX &= A^{-1}B \\
(A^{-1}A)X &= A^{-1}B \\
IX &= A^{-1}B \\
X &= A^{-1}B
\end{aligned}
\]

Therefore we can find X, the solution to the system, by computing $X = A^{-1}B$. Note
that once you have found $A^{-1}$, you can easily get the solution for different right hand sides
(different B). It is always just $A^{-1}B$.

We will explore this method of finding the solution to a system in the following example. 



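Solution. First, we can write the system of equations in matrix form
\[
AX = \begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = B \tag{2.10}
\]

The inverse of the matrix
\[
A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}
\]
is
\[
A^{-1} = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix}
\]

Verifying this inverse is left as an exercise.

From here, the solution to the given system 2.10 is found by
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = A^{-1}B
= \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}
= \begin{bmatrix} \frac{5}{2} \\ -2 \\ -\frac{3}{2} \end{bmatrix}
\]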


□ 


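What if the right side, B, of 2.10 had been $\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}$? In other
words, what would be the solution to
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}?
\]

By the above discussion, the solution is given by
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = A^{-1}B
= \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}
= \begin{bmatrix} 2 \\ -1 \\ -2 \end{bmatrix}
\]

This illustrates that for a system AX = B where $A^{-1}$ exists, it is easy to find the solution
when the vector B is changed.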
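The reuse of one inverse for several right hand sides can be sketched as follows. The entries of `A` and `A_inv` below are read off from the worked example and should be treated as assumptions of this sketch; `apply` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Coefficient matrix and its inverse, as read from the example above (assumed values).
A = [[1, 0, 1], [1, -1, 1], [1, 1, -1]]
A_inv = [[F(0), F(1, 2), F(1, 2)],
         [F(1), F(-1), F(0)],
         [F(1), F(-1, 2), F(-1, 2)]]


def apply(M, b):
    """Multiply the matrix M (a list of rows) by the column vector b."""
    return [sum(m * x for m, x in zip(row, b)) for row in M]


for B in ([1, 3, 2], [0, 1, 3]):
    X = apply(A_inv, B)                 # X = A^{-1} B
    print([str(x) for x in X], apply(A, X) == B)
# ['5/2', '-2', '-3/2'] True
# ['2', '-1', '-2'] True
```

The second check, `apply(A, X) == B`, confirms that each computed X really solves AX = B.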

We conclude this section with some important properties of the inverse. 



Consider the following theorem. 



2.1.9. Elementary Matrices 


We now turn our attention to a special type of matrix called an elementary matrix. An 
elementary matrix is always a square matrix. Recall the row operations given in Definition 
1.11. Any elementary matrix, which we often denote by E , is obtained from applying one 
row operation to the identity matrix of the same size. 

For example, the matrix
\[
E = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]
is the elementary matrix obtained from switching the two rows. The matrix
\[
E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
is the elementary matrix obtained from multiplying the second row of the 3 x 3 identity
matrix by 3. The matrix
\[
E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix}
\]
is the elementary matrix obtained from adding -3 times the first row to the third row.

You may construct an elementary matrix from any row operation, but remember that 
you can only apply one operation. 

Consider the following definition. 


Definition 2.43: Elementary Matrices and Row Operations 


Let E be an n x n matrix. Then E is an elementary matrix if it is the result of
applying one row operation to the n x n identity matrix $I_n$.

Those which involve switching rows of the identity matrix are called permutation
matrices.


Therefore, E constructed above by switching the two rows of $I_2$ is called a permutation
matrix.

Elementary matrices can be used in place of row operations and therefore are very useful. 
It turns out that multiplying (on the left hand side) by an elementary matrix E will have 
the same effect as doing the row operation used to obtain E. 
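A small Python sketch can make this concrete: build an elementary matrix by applying one row operation to the identity, then multiply on the left. All function names here (`identity`, `swap`, `scale`, `add_multiple`, `mat_mul`) are assumptions for illustration, and rows are counted from 1.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def swap(n, i, j):
    """Elementary matrix from switching rows i and j of the identity."""
    E = identity(n)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E


def scale(n, k, i):
    """Elementary matrix from multiplying row i of the identity by nonzero k."""
    E = identity(n)
    E[i - 1][i - 1] = k
    return E


def add_multiple(n, k, i, j):
    """Elementary matrix from adding k times row i of the identity to row j (i != j)."""
    E = identity(n)
    E[j - 1][i - 1] = k
    return E


def mat_mul(A, B):
    """Product AB of two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2], [3, 4], [5, 6]]
print(mat_mul(swap(3, 1, 2), A))             # [[3, 4], [1, 2], [5, 6]]
print(mat_mul(scale(3, 5, 2), A))            # [[1, 2], [15, 20], [5, 6]]
print(mat_mul(add_multiple(3, 2, 1, 3), A))  # [[1, 2], [3, 4], [7, 10]]
```

In each case the product EA is exactly A with the corresponding row operation applied.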

The following theorem is an important result which we will use throughout this text. 


Theorem 2.44: Multiplication by an Elementary Matrix and 
Row Operations 


To perform any of the three row operations on a matrix A it suffices to take the product 
EA, where E is the elementary matrix obtained by using the desired row operation 
on the identity matrix. 


Therefore, instead of performing row operations on a matrix A, we can row reduce through 
matrix multiplication with the appropriate elementary matrix. We will examine this theorem 
in detail for each of the three row operations given in Definition 1.11. 

First, consider the following lemma. 


Lemma 2.45: Action of Permutation Matrix 


Let $P^{ij}$ denote the elementary matrix which involves switching the $i^{th}$ and the $j^{th}$
rows. Then $P^{ij}$ is a permutation matrix and
\[
P^{ij}A = B
\]
where B is obtained from A by switching the $i^{th}$ and the $j^{th}$ rows.


We will explore this idea more in the following example. 


Example 2.46: Switching Rows with an Elementary Matrix 



Let
\[
P^{12} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} a & b \\ g & d \\ e & f \end{bmatrix}
\]
Find B where $B = P^{12}A$.


Solution. You can see that the matrix $P^{12}$ is obtained by switching the first and second rows
of the 3 x 3 identity matrix I.

Using our usual procedure, compute the product $P^{12}A = B$. The result is given by
\[
B = \begin{bmatrix} g & d \\ a & b \\ e & f \end{bmatrix}
\]

Notice that B is the matrix obtained by switching rows 1 and 2 of A. Therefore by
multiplying A by $P^{12}$, the row operation which was applied to I to obtain $P^{12}$ is applied to A to
obtain B. □


Theorem 2.44 applies to all three row operations, and we now look at the row operation 
of multiplying a row by a scalar. Consider the following lemma. 


Lemma 2.47: Multiplication by a Scalar and Elementary Matrices 


Let E(k, i) denote the elementary matrix corresponding to the row operation in which
the $i^{th}$ row is multiplied by the nonzero scalar, k. Then
\[
E(k, i)A = B
\]
where B is obtained from A by multiplying the $i^{th}$ row of A by k.


We will explore this lemma further in the following example.


Example 2.48: Multiplication of a Row by 5 Using Elementary Matrix

Let
\[
E(5, 2) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}
\]
Find the matrix B where B = E(5, 2)A.


Solution. You can see that E(5, 2) is obtained by multiplying the second row of the identity
matrix by 5.

Using our usual procedure for multiplication of matrices, we can compute the product
E(5, 2)A. The resulting matrix is given by
\[
B = \begin{bmatrix} a & b \\ 5c & 5d \\ e & f \end{bmatrix}
\]

Notice that B is obtained by multiplying the second row of A by the scalar 5. □

There is one last row operation to consider. The following lemma discusses the final 
operation of adding a multiple of a row to another row. 


Lemma 2.49: Adding Multiples of Rows and Elementary Matrices 


Let $E(k \times i + j)$ denote the elementary matrix obtained from I by adding k times the
$i^{th}$ row to the $j^{th}$. Then
\[
E(k \times i + j)A = B
\]
where B is obtained from A by adding k times the $i^{th}$ row to the $j^{th}$ row of A.


Consider the following example. 


Example 2.50: Adding Two Times the First Row to the Last 


Let
\[
E(2 \times 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}
\]
Find B where $B = E(2 \times 1 + 3)A$.


Solution. You can see that the matrix $E(2 \times 1 + 3)$ was obtained by adding 2 times the first
row of I to the third row of I.

Using our usual procedure, we can compute the product $E(2 \times 1 + 3)A$. The resulting
matrix B is given by
\[
B = \begin{bmatrix} a & b \\ c & d \\ 2a + e & 2b + f \end{bmatrix}
\]

You can see that B is the matrix obtained by adding 2 times the first row of A to the
third row. □

Suppose we have applied a row operation to a matrix A. Consider the row operation 
required to return A to its original form, to undo the row operation. It turns out that this 
action is how we find the inverse of an elementary matrix E. 

Consider the following theorem. 


Theorem 2.51: Elementary Matrices and Inverses 


Every elementary matrix is invertible and its inverse is also an elementary matrix. 


In fact, the inverse of an elementary matrix is constructed by doing the reverse row
operation on I. $E^{-1}$ will be obtained by performing the row operation which would carry E
back to I.

• If E is obtained by switching rows i and j, then $E^{-1}$ is also obtained by switching rows
i and j.

• If E is obtained by multiplying row i by the scalar k, then $E^{-1}$ is obtained by
multiplying row i by the scalar $\frac{1}{k}$.

• If E is obtained by adding k times row i to row j, then $E^{-1}$ is obtained by subtracting
k times row i from row j.
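The three rules above can be checked with a short sketch. The tuple encoding of row operations (`("swap", i, j)`, `("scale", k, i)`, `("add", k, i, j)`) and the function names are assumptions invented for this illustration; rows are counted from 1 and exact fractions are used so that $\frac{1}{k}$ is represented exactly.

```python
from fractions import Fraction


def elementary(n, op):
    """Apply one row operation (rows counted from 1) to the n x n identity."""
    E = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    kind = op[0]
    if kind == "swap":                 # ("swap", i, j)
        _, i, j = op
        E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    elif kind == "scale":              # ("scale", k, i) with k nonzero
        _, k, i = op
        E[i - 1][i - 1] = Fraction(k)
    elif kind == "add":                # ("add", k, i, j): add k * row i to row j, i != j
        _, k, i, j = op
        E[j - 1][i - 1] = Fraction(k)
    else:
        raise ValueError("unknown row operation")
    return E


def reverse(op):
    """Return the row operation that undoes op."""
    kind = op[0]
    if kind == "swap":
        return op                                  # switching again restores the rows
    if kind == "scale":
        _, k, i = op
        return ("scale", 1 / Fraction(k), i)       # multiply by 1/k
    if kind == "add":
        _, k, i, j = op
        return ("add", -k, i, j)                   # subtract k times row i
    raise ValueError("unknown row operation")


def mat_mul(A, B):
    """Product AB of two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


I3 = elementary(3, ("scale", 1, 1))    # scaling a row by 1 leaves the identity unchanged
for op in [("swap", 1, 3), ("scale", 5, 2), ("add", -3, 1, 3)]:
    E, E_inv = elementary(3, op), elementary(3, reverse(op))
    print(op, mat_mul(E, E_inv) == I3 and mat_mul(E_inv, E) == I3)
```

Each line printed ends in `True`, showing that the matrix built from the reverse operation acts as the inverse on both sides.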


Consider the following example. 


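Example 2.52: Inverse of an Elementary Matrix

Let
\[
E = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
\]
Find $E^{-1}$.


Solution. Consider the elementary matrix E given by
\[
E = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
\]

Here, E is obtained from the 2 x 2 identity matrix by multiplying the second row by 2. In
order to carry E back to the identity, we need to multiply the second row of E by $\frac{1}{2}$. Hence,
$E^{-1}$ is given by
\[
E^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix}
\]

We can verify that $EE^{-1} = I$. Take the product $EE^{-1}$, given by
\[
EE^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

This equals I so we know that we have computed $E^{-1}$ properly.

□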

Suppose an m x n matrix A is row reduced to its reduced row-echelon form. By tracking 
each row operation completed, this row reduction can be completed through multiplication 
by elementary matrices. Consider the following definition. 


Definition 2.53: The Form B = UA 


Let A be an m x n matrix and let B be the reduced row-echelon form of A. Then we
can write B = UA where U is the product of all elementary matrices representing the
row operations done to A to obtain B.


Consider the following example. 


Example 2.54: The Form B = UA 

Let $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}$. Find B, the reduced row-echelon
form of A and write it in the form B = UA.


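Solution. To find B, row reduce A. For each step, we will record the appropriate elementary
matrix. First, switch rows 1 and 2.
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix}
\]

The resulting matrix is equivalent to finding the product of
$P^{12} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and A.

Next, add (-2) times row 1 to row 3.
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
\]

This is equivalent to multiplying by the matrix
$E(-2 \times 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}$. Notice
that the resulting matrix is B, the required reduced row-echelon form of A.

We can then write
\[
\begin{aligned}
B &= E(-2 \times 1 + 3)\left(P^{12}A\right) \\
&= \left(E(-2 \times 1 + 3)P^{12}\right)A \\
&= UA
\end{aligned}
\]

It remains to find the matrix U.
\[
\begin{aligned}
U &= E(-2 \times 1 + 3)P^{12} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\end{aligned}
\]

We can verify that B = UA holds for this matrix U:
\[
UA = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
= B
\]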


□ 

While the process used in the above example is reliable and simple when only a few row
operations are used, it becomes cumbersome in a case where many row operations are needed
to carry A to B. The following theorem provides an alternate way to find the matrix U.


Theorem 2.55: Finding the Matrix U


Let A be an m x n matrix and let B be its reduced row-echelon form. Then B = UA
where U is an invertible m x m matrix found by forming the matrix $[A|I_m]$ and row
reducing to $[B|U]$.


Let’s revisit the above example using the process outlined in Theorem 2.55. 


Example 2.56: The Form B = UA, Revisited

Let $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}$. Using the process outlined in
Theorem 2.55, find U such that B = UA.


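Solution. First, set up the matrix $[A|I_m]$.
\[
\left[\begin{array}{cc|ccc}
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
2 & 0 & 0 & 0 & 1
\end{array}\right]
\]

Now, row reduce this matrix until the left side equals the reduced row-echelon form of A.
\[
\left[\begin{array}{cc|ccc}
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
2 & 0 & 0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{cc|ccc}
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 \\
2 & 0 & 0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{cc|ccc}
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & -2 & 1
\end{array}\right]
\]

The left side of this matrix is B, and the right side is U. Comparing this to the matrix
U found above in Example 2.54, you can see that the same matrix is obtained regardless of
which process is used. □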
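Row reducing $[A|I_m]$ to $[B|U]$ can be sketched in Python as below. This is an illustrative implementation under assumptions: the function name `rref_with_u` is hypothetical, exact `Fraction` arithmetic is used, and pivots are searched for only among the first n columns so that the left block ends in reduced row-echelon form.

```python
from fractions import Fraction


def rref_with_u(A):
    """Row reduce [A | I_m]; return (B, U) with B the RREF of A and B = UA."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(A)]
    r = 0                                   # index of the next pivot row
    for c in range(n):                      # pivots come only from the columns of A
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]     # switch rows
        p = M[r][c]
        M[r] = [x / p for x in M[r]]        # make the leading entry 1
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return [row[:n] for row in M], [row[n:] for row in M]


A = [[0, 1], [1, 0], [2, 0]]
B, U = rref_with_u(A)
print(B == [[1, 0], [0, 1], [0, 0]])              # True
print(U == [[0, 1, 0], [1, 0, 0], [0, -2, 1]])    # True
```

Because every row operation is applied to the identity block as well, the right block accumulates the product of the elementary matrices used.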

Recall from Algorithm 2.37 that an n x n matrix A is invertible if and only if A can
be carried to the n x n identity matrix using the usual row operations. This leads to an
important consequence related to the above discussion.

Suppose A is an n x n invertible matrix. Then, set up the matrix $[A|I_n]$ as done above,
and row reduce until it is of the form $[B|U]$. In this case, $B = I_n$ because A is invertible.
\[
\begin{aligned}
B &= UA \\
I_n &= UA \\
U^{-1} &= A
\end{aligned}
\]

Now suppose that $U = E_1E_2 \cdots E_k$ where each $E_i$ is an elementary matrix representing
a row operation used to carry A to I. Then,
\[
U^{-1} = (E_1E_2 \cdots E_k)^{-1} = E_k^{-1} \cdots E_2^{-1}E_1^{-1}
\]

Remember that if $E_i$ is an elementary matrix, so too is $E_i^{-1}$. It follows that
\[
A = U^{-1} = E_k^{-1} \cdots E_2^{-1}E_1^{-1}
\]
and A can be written as a product of elementary matrices.


Theorem 2.57: Product of Elementary Matrices 


Let A be an n x n matrix. Then A is invertible if and only if it can be written as a 
product of elementary matrices. 


Consider the following example. 


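Example 2.58: Product of Elementary Matrices

Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}$. Write A as a product of
elementary matrices.

Solution. We will use the process outlined in Theorem 2.55 to write A as a product of
elementary matrices. We will set up the matrix $[A|I]$ and row reduce, recording each row
operation as an elementary matrix.

First:
\[
\left[\begin{array}{ccc|ccc}
0 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 \\
0 & -2 & 1 & 0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{ccc|ccc}
1 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
0 & -2 & 1 & 0 & 0 & 1
\end{array}\right]
\]
represented by the elementary matrix
$E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Secondly:
\[
\left[\begin{array}{ccc|ccc}
1 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
0 & -2 & 1 & 0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
0 & -2 & 1 & 0 & 0 & 1
\end{array}\right]
\]
represented by the elementary matrix
$E_2 = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Finally:
\[
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
0 & -2 & 1 & 0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & -1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 1
\end{array}\right]
\]
represented by the elementary matrix
$E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}$.

Notice that the reduced row-echelon form of A is I. Hence I = UA where U is the
product of the above elementary matrices. It follows that $A = U^{-1}$. Since we want to write
A as a product of elementary matrices, we wish to express $U^{-1}$ as a product of elementary
matrices.
\[
\begin{aligned}
U^{-1} &= (E_3E_2E_1)^{-1} \\
&= E_1^{-1}E_2^{-1}E_3^{-1} \\
&= \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} \\
&= A
\end{aligned}
\]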

This gives A written as a product of elementary matrices. By Theorem 2.57 it follows 
that A is invertible. □ 
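
The factorization in Example 2.58 can be checked numerically. The following is a minimal sketch in plain Python; the helper name `matmul` is an assumption made for illustration and is not part of the text.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Inverses of the elementary matrices E1, E2, E3 from Example 2.58.
E1_inv = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # undoes the row switch
E2_inv = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]   # adds row 2 back to row 1
E3_inv = [[1, 0, 0], [0, 1, 0], [0, -2, 1]]  # subtracts 2 times row 2 from row 3

A = [[0, 1, 0], [1, 1, 0], [0, -2, 1]]

# The product of the three inverses, taken in the order E1^-1 E2^-1 E3^-1.
product = matmul(matmul(E1_inv, E2_inv), E3_inv)
print(product == A)
```

The order of the factors matters here: reversing it would generally give a different matrix, since matrix multiplication is not commutative.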


2.1.10. More on Matrix Inverses 


In this section, we will prove three theorems which will clarify the concept of matrix inverses. 
In order to do this, first recall some important properties of elementary matrices. 




Recall that an elementary matrix is a square matrix obtained by performing an elementary
operation on an identity matrix. Each elementary matrix is invertible, and its inverse
is also an elementary matrix. If E is an m x m elementary matrix and A is an m x n matrix,
then the product EA is the result of applying to A the same elementary row operation that
was applied to the m x m identity matrix in order to obtain E.

Let R be the reduced row-echelon form of an m x n matrix A. R is obtained by iteratively
applying a sequence of elementary row operations to A. Denote by $E_1, E_2, \ldots, E_k$ the
elementary matrices associated with the elementary row operations which were applied, in
order, to the matrix A to obtain the resulting R. We then have that
$R = (E_k \cdots (E_2 (E_1 A))) = E_k \cdots E_2 E_1 A$. Let E denote the product matrix
$E_k \cdots E_2 E_1$ so that we can write $R = EA$ where E is an invertible matrix whose
inverse is the product $(E_1)^{-1} (E_2)^{-1} \cdots (E_k)^{-1}$.

Now, we will consider some preliminary lemmas. 


Lemma 2.59: Invertible Matrix and Zeros 


Suppose that A and B are matrices such that the product AB is an identity matrix. 
Then the reduced row-echelon form of A does not have a row of zeros. 


Proof: Let R be the reduced row-echelon form of A. Then R = EA for some invertible 
square matrix E as described above. By hypothesis AB = I where I is an identity matrix, 
so we have a chain of equalities 

\[ R(BE^{-1}) = (EA)(BE^{-1}) = E(AB)E^{-1} = EIE^{-1} = EE^{-1} = I \]

If R had a row of zeros, then so would the product $R(BE^{-1})$. But since the identity
matrix I does not have a row of zeros, neither can R have one. ■

We now consider a second important lemma. 


Lemma 2.60: Size of Invertible Matrix 


Suppose that A and B are matrices such that the product AB is an identity matrix. 
Then A has at least as many columns as it has rows. 


Proof: Let R be the reduced row-echelon form of A. By Lemma 2.59, we know that R 
does not have a row of zeros, and therefore each row of R has a leading 1. Since each column 
of R contains at most one of these leading 1s, R must have at least as many columns as it
has rows. ■ 

An important theorem follows from this lemma. 

Theorem 2.61

If A and B are matrices such that both products AB and BA are identity matrices,
then A and B are square matrices of the same size.



Proof: Suppose that A and B are matrices such that both products AB and BA are 
identity matrices. We will show that A and B must be square matrices of the same size. Let 
the matrix A have m rows and n columns, so that A is an m x n matrix. Since the product 
AB exists, B must have n rows, and since the product BA exists, B must have m columns
so that B is an n x m matrix. To finish the proof, we need only verify that m = n.

We first apply Lemma 2.60 with A and B, to obtain the inequality $m \leq n$. We then apply
Lemma 2.60 again (switching the order of the matrices), to obtain the inequality $n \leq m$. It
follows that m = n, as we wanted. ■

Of course, not all square matrices are invertible. In particular, zero matrices are not 
invertible, along with many other square matrices. 

The following proposition will be useful in proving the next theorem. 


Proposition 2.62: Reduced Row-Echelon Form of a Square Matrix 


If R is the reduced row-echelon form of a square matrix, then either R has a row of 
zeros or R is an identity matrix. 


The proof of this proposition is left as an exercise to the reader. We now consider the 
second important theorem of this section. 

Theorem 2.63

Suppose A and B are square matrices such that AB = I, where I is an identity matrix.
Then BA = I as well. Consequently, A and B are both invertible, with $B = A^{-1}$ and
$A = B^{-1}$.



Proof: Let R be the reduced row-echelon form of a square matrix A. Then, R = EA
where E is an invertible matrix. Since AB = I, Lemma 2.59 gives us that R does not have
a row of zeros. By noting that R is a square matrix and applying Proposition 2.62, we see
that R = I. Hence, EA = I.

Using both that EA = I and AB = I, we can finish the proof with a chain of equalities
as given by

\[
\begin{aligned}
BA = IBIA &= (EA)B(E^{-1}E)A \\
&= E(AB)E^{-1}(EA) \\
&= EIE^{-1}I \\
&= EE^{-1} = I
\end{aligned}
\]

It follows from the definition of the inverse of a matrix that $B = A^{-1}$ and $A = B^{-1}$. ■

This theorem is very useful, since with it we need only test one of the products AB or 
BA in order to check that B is the inverse of A. The hypothesis that A and B are square 
matrices is very important, and without this the theorem does not hold. 

We will now consider an example. 
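Example 2.64: Non Square Matrices

Let

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \]

Show that $A^T A = I$ but $AA^T \neq I$.

Solution. Consider the product $A^T A$ given by

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Therefore, $A^T A = I_2$, where $I_2$ is the 2 x 2 identity matrix.

However, the product $AA^T$ is

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
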


Hence $AA^T$ is not the 3 x 3 identity matrix. This shows that for Theorem 2.63, it is essential
that both matrices be square and of the same size. □ 
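
The two products in Example 2.64 can be reproduced with a short sketch in plain Python. The helper names `matmul` and `transpose` are assumptions introduced only for this illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

A = [[1, 0], [0, 1], [0, 0]]   # the 3 x 2 matrix of Example 2.64
At = transpose(A)

AtA = matmul(At, A)   # 2 x 2 product
AAt = matmul(A, At)   # 3 x 3 product

print(AtA)   # the 2 x 2 identity matrix
print(AAt)   # has a row of zeros, so it is not the 3 x 3 identity
```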


Is it possible to have matrices A and B such that AB = I, while BA = 0? This question
is left to the reader to answer, and you should take a moment to consider the answer. 

We conclude this section with an important theorem. 


Theorem 2.65: The Reduced Row-Echelon Form of an Invertible Matrix 


For any matrix A the following conditions are equivalent: 

• A is invertible 

• The reduced row-echelon form of A is an identity matrix 


Proof. In order to prove this, we show that for any given matrix A, each condition implies the 
other. We first show that if A is invertible, then its reduced row-echelon form is an identity 
matrix, then we show that if the reduced row-echelon form of A is an identity matrix, then 
A is invertible. 

If A is invertible, there is some matrix B such that AB = I. By Lemma 2.59, we get 
that the reduced row-echelon form of A does not have a row of zeros. Then by Theorem 
2.61, it follows that A and the reduced row-echelon form of A are square matrices. Finally, 
by Proposition 2.62, this reduced row-echelon form of A must be an identity matrix. This 
proves the first implication. 

Now suppose the reduced row-echelon form of A is an identity matrix I. Then I = EA
for some product E of elementary matrices. By Theorem 2.63, we can conclude that A is 
invertible. □ 
Theorem 2.65 corresponds to Algorithm 2.37, which claims that $A^{-1}$ is found by row
reducing the augmented matrix $[A|I]$ to the form $[I|A^{-1}]$. This will be a matrix product
$E[A|I]$ where E is a product of elementary matrices. By the rules of matrix multiplication,
we have that $E[A|I] = [EA|EI] = [EA|E]$.

It follows that the reduced row-echelon form of $[A|I]$ is $[EA|E]$, where EA gives the
reduced row-echelon form of A. By Theorem 2.65, if $EA \neq I$, then A is not invertible, and
if EA = I, A is invertible. If EA = I, then by Theorem 2.63, $E = A^{-1}$. This proves that
Algorithm 2.37 does in fact find $A^{-1}$.
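
The procedure just described can be sketched in code. The following is an illustrative implementation in plain Python using exact fractions; the function name `inverse_by_row_reduction` is an assumption, and the sketch is not meant as the definitive form of Algorithm 2.37.

```python
from fractions import Fraction

def inverse_by_row_reduction(A):
    """Row reduce [A | I]; return the right block, or None if A is not invertible."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # the left block cannot be carried to I
        M[col], M[pivot] = M[pivot], M[col]        # switch two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]           # scale the row to get a leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add a multiple of the pivot row to clear the rest of the column
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[0, 1, 0], [1, 1, 0], [0, -2, 1]]      # the matrix of Example 2.58
print(inverse_by_row_reduction(A))
print(inverse_by_row_reduction([[1, 2], [2, 4]]))   # not invertible
```

For the matrix of Example 2.58, the returned right block agrees with the matrix U obtained there by row reducing $[A|I]$.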


2.1.11. Exercises 


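1. For the following pairs of matrices, determine if the sum A + B is defined. If so, find
the sum.

(a) $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & 1 & 2 \\ 1 & 1 & 0 \end{bmatrix}, B = \begin{bmatrix} -1 & 0 & 3 \\ 0 & 1 & 4 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & 0 \\ -2 & 3 \\ 4 & 2 \end{bmatrix}, B = \begin{bmatrix} 2 & 7 & -1 \\ 0 & 3 & 4 \end{bmatrix}$
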

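2. For each matrix A, find the matrix -A such that A + (-A) = 0.

(a) $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} -2 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & -1 & 3 \\ 4 & 2 & 0 \end{bmatrix}$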


3. In the context of Proposition 2.7, describe —A and 0. 


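4. For each matrix A, find the product (-2)A, 0A, and 3A.

(a) $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} -2 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & -1 & 3 \\ 4 & 2 & 0 \end{bmatrix}$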
5. Using only the properties given in Proposition 2.7 and Proposition 2.10, show -A is
unique.

6. Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0 is
unique.

7. Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0A = 0.
Here the 0 on the left is the scalar 0 and the 0 on the right is the zero matrix of
appropriate size.

8. Using only the properties given in Proposition 2.7 and Proposition 2.10, as well as
previous problems, show (-1)A = -A.


9. Consider the matrices A = 


12 3 
2 17 


,B = 


3 

-3 


-1 2 

2 1 


,C = 


1 2 
3 1 


,D = 


-1 

2 



2 

3 


Find the following if possible. If it is not possible explain why. 


(a) -3A 

(b) 3B - A

(c) AC 

(d) CB 

(e) AE 

(f) EA 


10. Consider the matrices A = 


' 1 

2 ' 

|- 

3 

2 

,B = 

1 

1 

T— I 

1 

- 


2 

-3 


-5 

2 


,C = 


-1 1 
4 -3 


E = 


1 

3 


Find the following if possible. If it is not possible explain why. 


1 2 
5 0 


,D = 


(a) -3A

(b) 3B - A

(c) AC

(d) CA

(e) AE

(f) EA

(g) BE

(h) DE
11. Let $A = \begin{bmatrix} 1 & 1 \\ -2 & -1 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 & -2 \\ 2 & 1 & -2 \end{bmatrix}$, and $C = \begin{bmatrix} 1 & 1 & -3 \\ -1 & 2 & 0 \\ -3 & -1 & 0 \end{bmatrix}$. Find the
following if possible.

(a) AB

(b) BA

(c) AC

(d) CA

(e) CB

(f) BC

12. Let $A = \begin{bmatrix} -1 & -1 \\ 3 & 3 \end{bmatrix}$. Find all 2 x 2 matrices B such that AB = 0.


13. Let $X = \begin{bmatrix} -1 & -1 & 1 \end{bmatrix}$ and $Y = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}$. Find $X^T Y$ and $XY^T$ if possible.


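14. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, B = \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix}$. Is it possible to choose k such that AB = BA? If
so, what should k equal?

15. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, B = \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix}$. Is it possible to choose k such that AB = BA? If
so, what should k equal?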


16. Find 2 x 2 matrices A, B, and C such that $A \neq 0$, $C \neq B$, but AC = AB.

17. Give an example of matrices (of any size) A, B, C such that $B \neq C$, $A \neq 0$, and yet
AB = AC.

18. Find 2 x 2 matrices A and B such that $A \neq 0$ and $B \neq 0$ but AB = 0.

19. Give an example of matrices (of any size) A, B such that $A \neq 0$ and $B \neq 0$ but
AB = 0.

20. Find 2 x 2 matrices A and B such that $A \neq 0$ and $B \neq 0$ with $AB \neq BA$.


21. Write the system

\[
\begin{array}{c}
x_1 - x_2 + 2x_3 \\
2x_3 + x_1 \\
3x_3 \\
3x_4 + 3x_2 + x_1
\end{array}
\]

in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where A is an appropriate matrix.

22. Write the system

\[
\begin{array}{c}
x_1 + 3x_2 + 2x_3 \\
2x_3 + x_1 \\
6x_3 \\
x_4 + 3x_2 + x_1
\end{array}
\]

in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where A is an appropriate matrix.

23. Write the system

\[
\begin{array}{c}
x_1 + x_2 + x_3 \\
2x_3 + x_1 + x_2 \\
x_3 - x_4 \\
3x_4 + x_1
\end{array}
\]

in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where A is an appropriate matrix.

24. A matrix A is called idempotent if $A^2 = A$. Let

\[ A = \begin{bmatrix} 2 & 0 & 2 \\ 1 & 1 & 2 \\ -1 & 0 & -1 \end{bmatrix} \]

and show that A is idempotent.

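25. For each pair of matrices, find the (1, 2)-entry and (2, 3)-entry of the product AB.

(a) $A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 4 & 0 \\ 2 & 5 & 1 \end{bmatrix}, B = \begin{bmatrix} 4 & 6 & -2 \\ 7 & 2 & 1 \\ -1 & 0 & 0 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 4 \\ 1 & 0 & 5 \end{bmatrix}, B = \begin{bmatrix} 2 & 3 & 0 \\ -4 & 16 & 1 \\ 0 & 2 & 2 \end{bmatrix}$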

26. Suppose A and B are square matrices of the same size. Which of the following are 
necessarily true? 

(a) $(A - B)^2 = A^2 - 2AB + B^2$

(b) $(AB)^2 = A^2 B^2$

(c) $(A + B)^2 = A^2 + 2AB + B^2$

(d) $(A + B)^2 = A^2 + AB + BA + B^2$

(e) $A^2 B^2 = A(AB)B$

(f) $(A + B)^3 = A^3 + 3A^2 B + 3AB^2 + B^3$

(g) $(A + B)(A - B) = A^2 - B^2$


27. Consider the matrices A = 


-1 1 
4 -3 


= 


1 

3 


" 1 2 ' 


3 2 

,B = 

1 -1 

- 


2 

-3 


-5 

2 


,C 


Find the following if possible. If it is not possible explain why. 


(a) $-3A^T$

(b) $3B - A^T$

(c) $E^T B$

(d) $EE^T$

(e) $B^T B$

(f) $CA^T$

(g) $D^T BE$


1 2 
5 0 


,D = 


28. Let A be an n x n matrix. Show A equals the sum of a symmetric and a skew symmetric
matrix. Hint: Show that $\frac{1}{2}(A^T + A)$ is symmetric and then consider using this as
one of the matrices.

29. Show that the main diagonal of every skew symmetric matrix consists of only zeros.
Recall that the main diagonal consists of every entry of the matrix which is of the form
$a_{ii}$.

30. Prove 3. That is, show that for an m x n matrix A, an m x n matrix B, and scalars
r, s, the following holds:

\[ (rA + sB)^T = rA^T + sB^T \]

31. Prove that $I_m A = A$ where A is an m x n matrix.

32. Suppose AB = AC and A is an invertible n x n matrix. Does it follow that B = C?
Explain why or why not.

33. Suppose AB = AC and A is a non invertible n x n matrix. Does it follow that B = C?
Explain why or why not.

34. Give an example of a matrix A such that $A^2 = I$ and yet $A \neq I$ and $A \neq -I$.

35. Let


Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

36. Let


Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

37. Let


Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

38. Let


Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

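39. Let A be a 2 x 2 invertible matrix, with $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Find a formula for $A^{-1}$ in
terms of a, b, c, d.

40. Let

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 1 & 0 & 2 \end{bmatrix} \]

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

41. Let

\[ A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \]

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

42. Let

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 4 & 5 & 10 \end{bmatrix} \]

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

43. Let

\[ A = \begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 1 & -3 & 2 \\ 1 & 2 & 1 & 2 \end{bmatrix} \]

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.
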

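44. Using the inverse of the matrix, find the solution to the systems:

(a) $\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$

(b) $\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$

Now give the solution in terms of a and b to

\[ \begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]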

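45. Using the inverse of the matrix, find the solution to the systems:

(a) $\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$

Now give the solution in terms of a, b, and c to the following:

\[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]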

46. Show that if A is an n x n invertible matrix and X is an n x 1 matrix such that AX = B
for B an n x 1 matrix, then $X = A^{-1}B$.

47. Prove that if $A^{-1}$ exists and AX = 0 then X = 0.

48. Show that if $A^{-1}$ exists for an n x n matrix, then it is unique. That is, if BA = I and
AB = I, then $B = A^{-1}$.

49. Show that if A is an invertible n x n matrix, then so is $A^T$ and $(A^T)^{-1} = (A^{-1})^T$.

50. Show $(AB)^{-1} = B^{-1}A^{-1}$ by verifying that

\[ AB(B^{-1}A^{-1}) = I \]

and

\[ B^{-1}A^{-1}(AB) = I \]

Hint: Use Problem 48.

51. Show that $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$ by verifying that $(ABC)(C^{-1}B^{-1}A^{-1}) = I$ and
$(C^{-1}B^{-1}A^{-1})(ABC) = I$. Hint: Use Problem 48.

52. If A is invertible, show $(A^2)^{-1} = (A^{-1})^2$. Hint: Use Problem 48.

53. If A is invertible, show $(A^{-1})^{-1} = A$. Hint: Use Problem 48.


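54. Let $A = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}$. Suppose a row operation is applied to A and the result is $B = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$.
Find the elementary matrix E that represents this row operation.

55. Let $A = \begin{bmatrix} 4 & 0 \\ 2 & 1 \end{bmatrix}$. Suppose a row operation is applied to A and the result is $B = \begin{bmatrix} 8 & 0 \\ 2 & 1 \end{bmatrix}$.
Find the elementary matrix E that represents this row operation.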


56. Let $A = \begin{bmatrix} 1 & -3 \\ 0 & 5 \end{bmatrix}$. Suppose a row operation is applied to A and the result is $B = \begin{bmatrix} 1 & -3 \\ 2 & -1 \end{bmatrix}$.
Find the elementary matrix E that represents this row operation.

57. Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to A and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & -1 & 4 \\ 0 & 5 & 1 \end{bmatrix}$.

(a) Find the elementary matrix E such that EA = B.

(b) Find the inverse of E, $E^{-1}$, such that $E^{-1}B = A$.


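58. Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to A and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 10 & 2 \\ 2 & -1 & 4 \end{bmatrix}$.

(a) Find the elementary matrix E such that EA = B.

(b) Find the inverse of E, $E^{-1}$, such that $E^{-1}B = A$.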


59. Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to A and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 1 & -\frac{1}{2} & 2 \end{bmatrix}$.

(a) Find the elementary matrix E such that EA = B.

(b) Find the inverse of E, $E^{-1}$, such that $E^{-1}B = A$.


60. Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to A and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 5 \\ 2 & -1 & 4 \end{bmatrix}$.

(a) Find the elementary matrix E such that EA = B.

(b) Find the inverse of E, $E^{-1}$, such that $E^{-1}B = A$.
3. Determinants


3.1 Basic Techniques and Properties 


Outcomes 


A. Evaluate the determinant of a square matrix using either Laplace Expansion or 
row operations. 

B. Demonstrate the effects that row operations have on determinants. 

C. Verify the following: 

(a) The determinant of a product of matrices is the product of the determinants. 

(b) The determinant of a matrix is equal to the determinant of its transpose. 


3.1.1. Cofactors and 2x2 Determinants 


Let A be an n x n matrix. That is, let A be a square matrix. The determinant of A, denoted
by det (A), is a very important number which we will explore throughout this section.

If A is a 2x2 matrix, the determinant is given by the following formula. 


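Definition 3.1: Determinant of a Two By Two Matrix

Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then

\[ \det(A) = ad - cb \]


The determinant is also often denoted by enclosing the matrix with two vertical lines.
Thus

\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]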


The following is an example of finding the determinant of a 2 x 2 matrix. 
Example 3.2: A Two by Two Determinant

Find det (A) for the matrix $A = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}$.

Solution. From Definition 3.1, 

det (A) = (2) (6) - (-1) (4) = 12 + 4 = 16 
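
As a quick check of Definition 3.1, the formula can be written as a one-line function. This is a minimal sketch in plain Python; the name `det2` is an assumption chosen for illustration.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]] as ad - cb."""
    (a, b), (c, d) = M
    return a * d - c * b

# The matrix of Example 3.2.
print(det2([[2, 4], [-1, 6]]))
```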


The 2x2 determinant can be used to find the determinant of larger matrices. We will 
now explore how to find the determinant of a 3 x 3 matrix, using several tools including the 
2x2 determinant. 

We begin with the following definition. 


Definition 3.3: The ij th Minor of a Matrix


Let A be a 3 x 3 matrix. The ij th minor of A, denoted as $\mathrm{minor}(A)_{ij}$, is the
determinant of the 2 x 2 matrix which results from deleting the i th row and the j th
column of A.

In general, if A is an n x n matrix, then the ij th minor of A is the determinant of the
(n - 1) x (n - 1) matrix which results from deleting the i th row and the j th column of A.


Hence, there is a minor associated with each entry of A. Consider the following example 
which demonstrates this definition. 


Example 3.4: Finding Minors of a Matrix

Let

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]

Find $\mathrm{minor}(A)_{12}$ and $\mathrm{minor}(A)_{23}$.

Solution. First we will find $\mathrm{minor}(A)_{12}$. By Definition 3.3, this is the determinant of the
2 x 2 matrix which results when you delete the first row and the second column. This minor
is given by

\[ \mathrm{minor}(A)_{12} = \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} \]

Using Definition 3.1, we see that

\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = (4)(1) - (3)(2) = 4 - 6 = -2 \]

Therefore $\mathrm{minor}(A)_{12} = -2$.

Similarly, $\mathrm{minor}(A)_{23}$ is the determinant of the 2 x 2 matrix which results when you
delete the second row and the third column. This minor is therefore

\[ \mathrm{minor}(A)_{23} = \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]

Finding the other minors of A is left as an exercise. □

The ij th minor of a matrix A is used in another important definition, given next.


Definition 3.5: The ij th Cofactor of a Matrix


Suppose A is an n x n matrix. The ij th cofactor, denoted by $\mathrm{cof}(A)_{ij}$, is defined to
be

\[ \mathrm{cof}(A)_{ij} = (-1)^{i+j} \, \mathrm{minor}(A)_{ij} \]


It is also convenient to refer to the cofactor of an entry of a matrix as follows. If $a_{ij}$ is
the ij th entry of the matrix, then its cofactor is just $\mathrm{cof}(A)_{ij}$.


Example 3.6: Finding Cofactors of a Matrix

Consider the matrix

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]

Find $\mathrm{cof}(A)_{12}$ and $\mathrm{cof}(A)_{23}$.

Solution. We will use Definition 3.5 to compute these cofactors.

First, we will compute $\mathrm{cof}(A)_{12}$. Therefore, we need to find $\mathrm{minor}(A)_{12}$. This is the
determinant of the 2 x 2 matrix which results when you delete the first row and the second
column. Thus $\mathrm{minor}(A)_{12}$ is given by

\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = -2 \]

Then,

\[ \mathrm{cof}(A)_{12} = (-1)^{1+2} \, \mathrm{minor}(A)_{12} = (-1)^{1+2}(-2) = 2 \]

Hence, $\mathrm{cof}(A)_{12} = 2$.

Similarly, we can find $\mathrm{cof}(A)_{23}$. First, find $\mathrm{minor}(A)_{23}$, which is the determinant of the
2 x 2 matrix which results when you delete the second row and the third column. This minor
is therefore

\[ \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]

Hence,

\[ \mathrm{cof}(A)_{23} = (-1)^{2+3} \, \mathrm{minor}(A)_{23} = (-1)^{2+3}(-4) = 4 \]
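
The minors and cofactors of Definitions 3.3 and 3.5 can be computed mechanically for a 3 x 3 matrix. The sketch below is in plain Python; the function names `det2`, `minor` and `cofactor` are assumptions made for illustration, and rows and columns are counted from 1 to match the text.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]] as ad - cb."""
    (a, b), (c, d) = M
    return a * d - c * b

def minor(A, i, j):
    """ij th minor of a 3 x 3 matrix: delete row i and column j, then take det."""
    sub = [[A[r][c] for c in range(3) if c != j - 1]
           for r in range(3) if r != i - 1]
    return det2(sub)

def cofactor(A, i, j):
    """ij th cofactor: the minor with the sign (-1)^(i+j) attached."""
    return (-1) ** (i + j) * minor(A, i, j)

A = [[1, 2, 3], [4, 3, 2], [3, 2, 1]]   # the matrix of Examples 3.4 and 3.6
print(minor(A, 1, 2), minor(A, 2, 3))
print(cofactor(A, 1, 2), cofactor(A, 2, 3))
```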
You may wish to find the remaining cofactors for the above matrix. Remember that
there is a cofactor for every entry in the matrix.

We have now established the tools we need to find the determinant of a 3 x 3 matrix. 


Definition 3.7: The Determinant of a Three By Three Matrix 


Let A be a 3 x 3 matrix. Then, det (A) is calculated by picking a row (or column) and
taking the product of each entry in that row (column) with its cofactor and adding
these products together.

This process when applied to the i th row (column) is known as expanding along
the i th row (column) and is given by

\[ \det(A) = a_{i1}\,\mathrm{cof}(A)_{i1} + a_{i2}\,\mathrm{cof}(A)_{i2} + a_{i3}\,\mathrm{cof}(A)_{i3} \]


When calculating the determinant, you can choose to expand any row or any column.
Regardless of your choice, you will always get the same number which is the determinant
of the matrix A. This method of evaluating a determinant by expanding along a row or a
column is called Laplace Expansion or Cofactor Expansion.

Consider the following example. 


Example 3.8: Finding the Determinant of a Three by Three Matrix

Let

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]

Find det (A) using the method of Laplace Expansion.

Solution. First, we will calculate det (A) by expanding along the first column. Using
Definition 3.7, we take the 1 in the first column and multiply it by its cofactor,

\[ 1(-1)^{1+1} \begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix} = (1)(1)(-1) = -1 \]

Similarly, we take the 4 in the first column and multiply it by its cofactor, as well as with
the 3 in the first column. Finally, we add these numbers together, as given in the following
equation.

\[
\det(A) = 1\overbrace{(-1)^{1+1} \begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix}}^{\mathrm{cof}(A)_{11}}
+ 4\overbrace{(-1)^{2+1} \begin{vmatrix} 2 & 3 \\ 2 & 1 \end{vmatrix}}^{\mathrm{cof}(A)_{21}}
+ 3\overbrace{(-1)^{3+1} \begin{vmatrix} 2 & 3 \\ 3 & 2 \end{vmatrix}}^{\mathrm{cof}(A)_{31}}
\]

Calculating each of these, we obtain

\[ \det(A) = 1(1)(-1) + 4(-1)(-4) + 3(1)(-5) = -1 + 16 - 15 = 0 \]

Hence, det (A) = 0.

As mentioned in Definition 3.7, we can choose to expand along any row or column. Let's
try now by expanding along the second row. Here, we take the 4 in the second row and
multiply it by its cofactor, then add this to the 3 in the second row multiplied by its cofactor,
and the 2 in the second row multiplied by its cofactor. The calculation is as follows.

\[
\det(A) = 4\overbrace{(-1)^{2+1} \begin{vmatrix} 2 & 3 \\ 2 & 1 \end{vmatrix}}^{\mathrm{cof}(A)_{21}}
+ 3\overbrace{(-1)^{2+2} \begin{vmatrix} 1 & 3 \\ 3 & 1 \end{vmatrix}}^{\mathrm{cof}(A)_{22}}
+ 2\overbrace{(-1)^{2+3} \begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix}}^{\mathrm{cof}(A)_{23}}
\]

Calculating each of these products, we obtain

\[ \det(A) = 4(-1)(-4) + 3(1)(-8) + 2(-1)(-4) = 16 - 24 + 8 = 0 \]

You can see that for both methods, we obtained det (A) = 0. □


As mentioned above, we will always come up with the same value for det (A) regardless 
of the row or column we choose to expand along. You should try to compute the above 
determinant by expanding along other rows and columns. This is a good way to check your 
work, because you should come up with the same number each time! 
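
The suggestion to expand along other rows and columns can be automated. The following sketch, in plain Python, expands the matrix of Example 3.8 along each of its three rows and three columns; the helper names are assumptions introduced for illustration, with rows and columns counted from 1.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]] as ad - cb."""
    (a, b), (c, d) = M
    return a * d - c * b

def cofactor(A, i, j):
    """ij th cofactor of a 3 x 3 matrix, with i and j counted from 1."""
    sub = [[A[r][c] for c in range(3) if c != j - 1]
           for r in range(3) if r != i - 1]
    return (-1) ** (i + j) * det2(sub)

def expand_along_row(A, i):
    """Laplace Expansion of a 3 x 3 determinant along row i."""
    return sum(A[i - 1][j - 1] * cofactor(A, i, j) for j in (1, 2, 3))

def expand_along_column(A, j):
    """Laplace Expansion of a 3 x 3 determinant along column j."""
    return sum(A[i - 1][j - 1] * cofactor(A, i, j) for i in (1, 2, 3))

A = [[1, 2, 3], [4, 3, 2], [3, 2, 1]]
results = ([expand_along_row(A, i) for i in (1, 2, 3)]
           + [expand_along_column(A, j) for j in (1, 2, 3)])
print(results)   # all six expansions give the same value
```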

We present this idea formally in the following theorem. 

Theorem 3.9

Expanding an n x n matrix along any row or any column always gives the same result,
which is the determinant of the matrix.



We have now looked at the determinant of 2 x 2 and 3x3 matrices. It turns out that 
the method used to calculate the determinant of a 3 x 3 matrix can be used to calculate the 
determinant of any sized matrix. Notice that Definition 3.3, Definition 3.5 and Definition 
3.7 can all be applied to a matrix of any size. 

For example, the ij th minor of a 4 x 4 matrix is the determinant of the 3x3 matrix you 
obtain when you delete the i th row and the j th column. Just as with the 3x3 determinant, 
we can compute the determinant of a 4 x 4 matrix by Laplace Expansion, along any row or
column.

Consider the following example. 


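Example 3.10: Determinant of a Four by Four Matrix

Find det (A) where

\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 4 & 2 & 3 \\ 1 & 3 & 4 & 5 \\ 3 & 4 & 3 & 2 \end{bmatrix} \]

Solution. As in the case of a 3 x 3 matrix, you can expand this along any row or column.
Let's pick the third column. Then, using Laplace Expansion,

\[
\begin{aligned}
\det(A) = {} & 3(-1)^{1+3} \begin{vmatrix} 5 & 4 & 3 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix}
+ 2(-1)^{2+3} \begin{vmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix} \\
& + 4(-1)^{3+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 3 & 4 & 2 \end{vmatrix}
+ 3(-1)^{4+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 1 & 3 & 5 \end{vmatrix}
\end{aligned}
\]
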


Now, you can calculate each 3x3 determinant using Laplace Expansion, as we did above. 
You should complete these as an exercise and verify that det (A) = —12. □ 
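
The value stated in Example 3.10 can be confirmed with a short recursive routine. This is a minimal sketch in plain Python that always expands along the first row, which is one of many valid choices; the function name `det` is an assumption made for illustration.

```python
def det(A):
    """Determinant of a square matrix by Laplace Expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and column j to form the submatrix for the minor.
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        # With 0-based j, the sign (-1)^(1 + (j + 1)) simplifies to (-1)^j.
        total += (-1) ** j * A[0][j] * det(sub)
    return total

A = [[1, 2, 3, 4], [5, 4, 2, 3], [1, 3, 4, 5], [3, 4, 3, 2]]
print(det(A))
```

Each level of the recursion spawns n smaller determinants, so this approach becomes slow for large matrices; it is intended only to mirror the definition.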

The following provides a formal definition for the determinant of an n x n matrix. You 
may wish to take a moment and consider the above definitions for 2 x 2 and 3x3 determinants 
in context of this definition. 

Definition 3.11

Let A be an n x n matrix with entries $a_{ij}$. Then det (A) is obtained by Laplace
Expansion along any row i or any column j:

\[ \det(A) = \sum_{j=1}^{n} a_{ij}\,\mathrm{cof}(A)_{ij} = \sum_{i=1}^{n} a_{ij}\,\mathrm{cof}(A)_{ij} \]



In the following sections, we will explore some important properties and characteristics 
of the determinant. 


3.1.2. The Determinant of a Triangular Matrix 


There is a certain type of matrix for which finding the determinant is a very simple procedure. 
Consider the following definition. 
Definition 3.12: Triangular Matrices


A matrix A is upper triangular if $a_{ij} = 0$ whenever i > j. Thus the entries of such
a matrix below the main diagonal equal 0, as shown. Here, * refers to an arbitrary
number, which may or may not be zero.

\[
\begin{bmatrix}
* & * & \cdots & * \\
0 & * & \cdots & \vdots \\
\vdots & \vdots & \ddots & * \\
0 & \cdots & 0 & *
\end{bmatrix}
\]

A lower triangular matrix is defined similarly as a matrix for which all entries above
the main diagonal are equal to zero.


The following theorem provides a useful way to calculate the determinant of a triangular 
matrix. 


Theorem 3.13: Determinant of a Triangular Matrix 


Let A be an upper or lower triangular matrix. Then det (A) is obtained by taking the 
product of the entries on the main diagonal. 


The verification of this Theorem can be done by computing the determinant using Laplace 
Expansion along the first row or column. 

Consider the following example. 


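Example 3.14: Determinant of a Triangular Matrix

Let

\[ A = \begin{bmatrix} 1 & 2 & 3 & 77 \\ 0 & 2 & 6 & 7 \\ 0 & 0 & 3 & 33.7 \\ 0 & 0 & 0 & -1 \end{bmatrix} \]

Find det (A).

Solution. From Theorem 3.13, it suffices to take the product of the elements on the main
diagonal. Thus det (A) = 1 x 2 x 3 x (-1) = -6.

Without using Theorem 3.13, you could use Laplace Expansion. We will expand along
the first column. This gives

\[
\begin{aligned}
\det(A) = {} & 1 \begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix}
+ 0(-1)^{2+1} \begin{vmatrix} 2 & 3 & 77 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} \\
& + 0(-1)^{3+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 0 & -1 \end{vmatrix}
+ 0(-1)^{4+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 3 & 33.7 \end{vmatrix}
\end{aligned}
\]

and the only nonzero term in the expansion is

\[ 1 \begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} \]

Now find the determinant of this 3 x 3 matrix, by expanding along the first column to obtain

\[
\begin{aligned}
\det(A) &= 1 \times \left( 2 \times \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix}
+ 0(-1)^{2+1} \begin{vmatrix} 6 & 7 \\ 0 & -1 \end{vmatrix}
+ 0(-1)^{3+1} \begin{vmatrix} 6 & 7 \\ 3 & 33.7 \end{vmatrix} \right) \\
&= 1 \times 2 \times \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix}
\end{aligned}
\]

Next use Definition 3.1 to find the determinant of this 2 x 2 matrix, which is just
3 x (-1) - 0 x 33.7 = -3. Putting all these steps together, we have

\[ \det(A) = 1 \times 2 \times 3 \times (-1) = -6 \]

which is just the product of the entries down the main diagonal of the original matrix! □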

You can see that while both methods result in the same answer, Theorem 3.13 provides 
a much quicker method. 
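
The comparison between the two methods can be illustrated with a short sketch in plain Python. The function name `det` is an assumption, and the entry 33.7 is stored as an exact fraction so that the Laplace Expansion involves no rounding.

```python
from fractions import Fraction
from math import prod

def det(A):
    """Determinant by Laplace Expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))

# The upper triangular matrix of Example 3.14.
A = [[1, 2, 3, 77],
     [0, 2, 6, 7],
     [0, 0, 3, Fraction("33.7")],
     [0, 0, 0, -1]]

diagonal_product = prod(A[i][i] for i in range(4))   # Theorem 3.13
laplace_value = det(A)                               # full expansion

print(diagonal_product, laplace_value)
```

The diagonal product needs only n - 1 multiplications, while the full expansion visits many submatrices that contribute zero.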

In the next section, we explore some important properties of determinants. 


3.1.3. Properties of Determinants 


There are many important properties of determinants. Since many of these properties involve 
the row operations discussed in Chapter 1, we recall that definition now. 

Definition 3.15

The row operations consist of the following:

1. Switch two rows.

2. Multiply a row by a nonzero number.

3. Replace a row by a multiple of another row added to itself.



We will now consider the effect of row operations on the determinant of a matrix. In 
future sections, we will see that using the following properties can greatly assist in finding 
determinants. 

The first theorem explains the effect on the determinant of a matrix when two rows are
switched. 
Theorem 3.16: Switching Rows


Let A be an n x n matrix and let B be a matrix which results from switching two rows 
of A. Then det (B) = — det (A) . 


When we switch two rows of a matrix, the determinant is multiplied by —1. Consider 
the following example. 


Example 3.17: Switching Two Rows

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and let $B = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}$. Knowing that det (A) = -2, find det (B).


Solution. By Definition 3.1, det (A) = 1 x 4 - 3 x 2 = -2. Notice that the rows of B are
the rows of A but switched. By Theorem 3.16, since two rows of A have been switched,
det (B) = - det (A) = - (-2) = 2. You can verify this using Definition 3.1. □

The next theorem demonstrates the effect on the determinant of a matrix when we 
multiply a row by a scalar. 


Theorem 3.18: Multiplying a Row by a Scalar 


Let A be an n x n matrix and let B be a matrix which results from multiplying some 
row of A by a scalar k. Then det (B) = k det (A). 


Notice that this theorem is true when we multiply one row of the matrix by k. If we were
to multiply two rows of A by k to obtain B, we would have $\det(B) = k^2 \det(A)$. Suppose
we were to multiply all n rows of A by k to obtain the matrix B, so that B = kA. Then,
$\det(B) = k^n \det(A)$. This gives the next theorem.

Theorem 3.19

Let A be an n x n matrix and let k be a scalar. Then $\det(kA) = k^n \det(A)$.



Consider the following example. 


Example 3.20: Multiplying a Row by 5

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, B = \begin{bmatrix} 5 & 10 \\ 3 & 4 \end{bmatrix}$. Knowing that det (A) = -2, find det (B).


Solution. By Definition 3.1, det (A) = -2. We can also compute det (B) using Definition
3.1, and we see that det (B) = -10.

Now, let's compute det (B) using Theorem 3.18 and see if we obtain the same answer.
Notice that the first row of B is 5 times the first row of A, while the second row of B is
equal to the second row of A. By Theorem 3.18, det (B) = 5 x det (A) = 5 x (-2) = -10.
You can see that this matches our answer above. □

Finally, consider the next theorem for the last row operation, that of adding a multiple 
of a row to another row. 


Theorem 3.21: Adding a Multiple of a Row to Another Row 


Let A be an n x n matrix and let B be a matrix which results from adding a multiple 
of a row to another row. Then det (A) = det (B). 


Therefore, when we add a multiple of a row to another row, the determinant of the matrix 
is unchanged. Note that if a matrix A contains a row which is a multiple of another row, 
det (A) will equal 0. To see this, suppose the first row of A is equal to —1 times the second 
row. By Theorem 3.21, we can add the first row to the second row, and the determinant 
will be unchanged. However, this row operation will result in a row of zeros. Using Laplace 
Expansion along the row of zeros, we find that the determinant is 0. 

Consider the following example. 


Example 3.22: Adding a Row to Another Row 

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and let $B = \begin{bmatrix} 1 & 2 \\ 5 & 8 \end{bmatrix}$. Find det (B). 


Solution. By Definition 3.1, det (A) = -2. Notice that the second row of B is two times 
the first row of A added to the second row. By Theorem 3.21, det (B) = det (A) = -2. As 
usual, you can verify this answer using Definition 3.1. □ 


Example 3.23: Multiple of a Row 

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$. Show that det (A) = 0. 


Solution. Using Definition 3.1, the determinant is given by 

det (A) = 1 x 4 - 2 x 2 = 0 

However notice that the second row is equal to 2 times the first row. Then by the 
discussion above following Theorem 3.21 the determinant will equal 0. □ 
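
The effect of each row operation on the determinant can be seen side by side in a short computation. This is a sketch under stated assumptions: `det2` is a hypothetical helper for 2 x 2 matrices, and the matrices are the small ones used in the examples of this section.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

A = [[1, 2], [3, 4]]

swapped = [A[1], A[0]]                                    # switch the two rows
scaled = [[5 * x for x in A[0]], A[1]]                    # multiply the first row by 5
added = [A[0], [2 * p + q for p, q in zip(A[0], A[1])]]   # add 2 x (row 1) to row 2
dependent = [[1, 2], [2, 4]]                              # second row is 2 x (first row)

print(det2(A), det2(swapped), det2(scaled), det2(added), det2(dependent))
# prints: -2 2 -10 -2 0
```

Switching rows changes the sign, scaling a row scales the determinant, adding a multiple of a row changes nothing, and a row that is a multiple of another forces the determinant to be 0.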

Until now, our focus has primarily been on row operations. However, we can carry out the 
same operations with columns, rather than rows. The three operations outlined in Definition 
3.15 can be done with columns instead of rows. In this case, in Theorems 3.16, 3.18, and 
3.21 you can replace the word "row" with the word "column". 

There are several other major properties of determinants which do not involve row (or 
column) operations. The first is the determinant of a product of matrices. 



In order to find the determinant of a product of matrices, we can simply take the product 
of the determinants. 

Consider the following example. 


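Example 3.25: The Determinant of a Product 

Compare det (AB) and det (A) det (B) for 

$$A = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix}$$


Solution. First compute AB, which is given by 

$$AB = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} = \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix}$$

and so by Definition 3.1 

$$\det(AB) = \det \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix} = -40$$

Now 

$$\det(A) = \det \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} = 8$$

and 

$$\det(B) = \det \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} = -5$$

Computing det (A) x det (B) we have 8 x (-5) = -40. This is the same answer as above 
and you can see that det (A) det (B) = 8 x (-5) = -40 = det (AB). □ 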
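
The comparison in Example 3.25 can be reproduced in a few lines. This is a minimal sketch; the helper names `det2` and `matmul` are illustrative assumptions rather than anything defined in the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

A = [[1, 2], [-3, 2]]
B = [[3, 2], [4, 1]]
AB = matmul(A, B)

print(AB)                              # [[11, 4], [-1, -4]]
print(det2(AB), det2(A) * det2(B))     # -40 -40
```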


Consider the next important property. 
This theorem is illustrated in the following example. 



Solution. First, note that 


Using Definition 3.1, we can compute det (A) and det (A^T). It follows that det (A) = 
2 x 3 - 4 x 5 = -14 and det (A^T) = 2 x 3 - 5 x 4 = -14. Hence, det (A) = det (A^T). □ 
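
As a quick check of this property, the sketch below compares the determinant of a 2 x 2 matrix with that of its transpose. The matrix used here is an assumption: it is chosen to be consistent with the arithmetic in the solution above, since the statement of the example is not reproduced in this text. The helpers `det2` and `transpose` are likewise illustrative.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def transpose(m):
    """Rows become columns."""
    return [list(col) for col in zip(*m)]

A = [[2, 5], [4, 3]]   # assumed matrix, consistent with det(A) = 2 x 3 - 4 x 5
print(det2(A), det2(transpose(A)))   # -14 -14
```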

The following provides an essential property of the determinant, as well as a useful way 
to determine if a matrix is invertible. 



Consider the following example. 


Example 3.29: Determinant of an Invertible Matrix 

Let $A = \begin{bmatrix} 3 & 6 \\ 2 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 2 & 3 \\ 5 & 1 \end{bmatrix}$. For each matrix, determine if it is invertible. If so, 
find the determinant of the inverse. 


Solution. Consider the matrix A first. Using Definition 3.1 we can find the determinant as 
follows: 

det (A) = 3 x 4 - 2 x 6 = 12 - 12 = 0 

By Theorem 3.28 A is not invertible. 

Now consider the matrix B. Again by Definition 3.1 we have 

det (B) = 2 x 1 - 5 x 3 = 2 - 15 = -13 

By Theorem 3.28 B is invertible and the determinant of the inverse is given by 

$$\det(B^{-1}) = \frac{1}{\det(B)} = \frac{1}{-13} = -\frac{1}{13}$$

□ 


3.1.4. Finding Determinants using Row Operations 


Theorems 3.16, 3.18 and 3.21 illustrate how row operations affect the determinant of a 
matrix. In this section, we look at two examples where row operations are used to find 
the determinant of a large matrix. Recall that when working with large matrices, Laplace 
Expansion is effective but time-consuming, as there are many steps involved. This section 
provides useful tools for an alternative method. By first applying row operations, we can 
obtain a simpler matrix to which we apply Laplace Expansion. 

While working through questions such as these, it is useful to record your row operations 
as you go along. Keep this in mind as you read through the next example. 



Solution. We will use the properties of determinants outlined above to find det (A). First, 
add —5 times the first row to the second row. Then add —4 times the first row to the third 
row, and —2 times the first row to the fourth row. This yields the matrix 



$$B = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -9 & -13 & -17 \\ 0 & -3 & -8 & -13 \\ 0 & -2 & -10 & -3 \end{bmatrix}$$


Notice that the only row operation we have done so far is adding a multiple of a row to 
another row. Therefore, by Theorem 3.21, det (B) = det (A) . 

At this stage, you could use Laplace Expansion to find det (B). However, we will continue 
with row operations to find an even simpler matrix to work with. 

Add —3 times the third row to the second row. By Theorem 3.21 this does not change 
the value of the determinant. Then, multiply the fourth row by —3. This results in the 
matrix 


$$C = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 11 & 22 \\ 0 & -3 & -8 & -13 \\ 0 & 6 & 30 & 9 \end{bmatrix}$$

Here, det (C) = -3 det (B), which means that det (B) = (-1/3) det (C). 

Since det (A) = det (B), we now have that det (A) = (-1/3) det (C). Again, you could use 
Laplace Expansion here to find det (C). However, we will continue with row operations. 

Now add 2 times the third row to the fourth row. This does not change the 
value of the determinant by Theorem 3.21. Finally switch the third and second rows. This 
causes the determinant to be multiplied by -1. Thus det (C) = - det (D) where 

$$D = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -3 & -8 & -13 \\ 0 & 0 & 11 & 22 \\ 0 & 0 & 14 & -17 \end{bmatrix}$$

Hence, det (A) = (-1/3) det (C) = (1/3) det (D). 

You could do more row operations or you could note that this can be easily expanded 
along the first column. Then, expand the resulting 3x3 matrix also along the first column. 
This results in 


$$\det(D) = 1(-3) \det \begin{bmatrix} 11 & 22 \\ 14 & -17 \end{bmatrix} = 1485$$

and so det (A) = (1/3) (1485) = 495. 


□ 


You can see that by using row operations, we can simplify a matrix to the point where 
Laplace Expansion involves only a few steps. In Example 3.30, we also could have continued 
until the matrix was in upper triangular form, and taken the product of the entries on the 
main diagonal. Whenever computing the determinant, it is useful to consider all the possible 
methods and tools. 
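
The bookkeeping in this method (a sign change for each row switch, no change for row additions, then the product of the diagonal) can be sketched in code. This is an illustrative implementation under stated assumptions, not the text's own procedure: the function name `det_by_row_reduction` is hypothetical, and exact fractions are used to avoid rounding. It is applied to the matrix B from Example 3.30, which has the same determinant as A.

```python
from fractions import Fraction

def det_by_row_reduction(matrix):
    """Reduce to upper triangular form, tracking how each operation changes the determinant."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)       # no pivot available: the determinant is zero
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign             # switching two rows multiplies the determinant by -1
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            # Adding a multiple of one row to another leaves the determinant unchanged.
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    product = Fraction(1)
    for i in range(n):
        product *= m[i][i]           # triangular matrix: product of the main diagonal
    return sign * product

B = [[1, 2, 3, 4], [0, -9, -13, -17], [0, -3, -8, -13], [0, -2, -10, -3]]
print(det_by_row_reduction(B))   # 495
```

This sketch never multiplies a row by a scalar, so no compensating factor such as the 1/3 in the example is needed; only row switches have to be tracked.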

Consider the next example. 



Solution. Once again, we will simplify the matrix through row operations. Add —1 times 
the first row to the second row. Next add —2 times the first row to the third and finally 
take —3 times the first row and add to the fourth row. This yields 


$$B = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -1 & -1 \\ 0 & -3 & -4 & 1 \\ 0 & -10 & -8 & -4 \end{bmatrix}$$

By Theorem 3.21, det (A) = det (B). 
Remember you can work with the columns also. Take -5 times the fourth column and 
add to the second column. This yields 

$$C = \begin{bmatrix} 1 & -8 & 3 & 2 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}$$

By Theorem 3.21 det (A) = det (C). 

Now take -1 times the third row and add to the top row. This gives 

$$D = \begin{bmatrix} 1 & 0 & 7 & 1 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}$$

which by Theorem 3.21 has the same determinant as A. 

Now, we can find det (D) by expanding along the first column as follows. You can see 
that there will be only one non zero term. 


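$$\det(D) = 1 \det \begin{bmatrix} 0 & -1 & -1 \\ -8 & -4 & 1 \\ 10 & -8 & -4 \end{bmatrix} + 0 + 0 + 0$$

Expanding again along the first column, we have 

$$\det(D) = 1 \left( 0 + 8 \det \begin{bmatrix} -1 & -1 \\ -8 & -4 \end{bmatrix} + 10 \det \begin{bmatrix} -1 & -1 \\ -4 & 1 \end{bmatrix} \right) = -82$$

Now since det (A) = det (D), it follows that det (A) = -82. 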


□ 


Remember that you can verify these answers by using Laplace Expansion on A. Similarly, 
if you first compute the determinant using Laplace Expansion, you can use the row operation 
method to verify. 
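
A verification of this kind can also be automated. The sketch below is an illustration, not the text's method: the helper `det` is a hypothetical recursive Laplace expansion along the first column, applied to the matrices C and D from the example above.

```python
def det(m):
    """Laplace expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for i in range(len(m)):
        if m[i][0] == 0:
            continue  # zero entries contribute nothing to the expansion
        # Minor: delete row i and the first column.
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total

C = [[1, -8, 3, 2], [0, 0, -1, -1], [0, -8, -4, 1], [0, 10, -8, -4]]
D = [[1, 0, 7, 1], [0, 0, -1, -1], [0, -8, -4, 1], [0, 10, -8, -4]]
print(det(C), det(D))   # -82 -82
```

Because the first column of each matrix has a single nonzero entry, the expansion has only one nonzero term at the top level, exactly as in the hand computation.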


3.1.5. Exercises 


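1. Find the determinants of the following matrices. 

(a) $\begin{bmatrix} 1 & 3 \\ 0 & 2 \end{bmatrix}$

(b) $\begin{bmatrix} 0 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $\begin{bmatrix} 4 & 3 \\ 6 & 2 \end{bmatrix}$

2. Let $A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 3 \\ -2 & 5 & 1 \end{bmatrix}$. Find the following. 

(a) $\operatorname{minor}(A)_{11}$

(b) $\operatorname{minor}(A)_{21}$

(c) $\operatorname{minor}(A)_{32}$

(d) $\operatorname{cof}(A)_{11}$

(e) $\operatorname{cof}(A)_{21}$

(f) $\operatorname{cof}(A)_{32}$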

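3. Find the determinants of the following matrices. 

(a) $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 2 \\ 0 & 9 & 8 \end{bmatrix}$

(b) $\begin{bmatrix} 4 & 3 & 2 \\ 1 & 7 & 8 \\ 3 & -9 & 3 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 3 \\ 4 & 1 & 5 & 0 \\ 1 & 2 & 1 & 2 \end{bmatrix}$

4. Find the following determinant by expanding along the first row and second column. 

$$\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}$$

5. Find the following determinant by expanding along the first column and third row. 

$$\begin{vmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{vmatrix}$$

6. Find the following determinant by expanding along the second row and first column. 

$$\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}$$

7. Compute the determinant by cofactor expansion. Pick the easiest row or column to 
use. 

$$\begin{vmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ 2 & 1 & 3 & 1 \end{vmatrix}$$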
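8. Find the determinant of the following matrices. 

(a) $A = \begin{bmatrix} 1 & -34 \\ 0 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 4 & 3 & 14 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 3 & 15 & 0 \\ 0 & 4 & 1 & 7 \\ 0 & 0 & -3 & 5 \\ 0 & 0 & 0 & 1 \end{bmatrix}$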


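9. An operation is done to get from the first matrix to the second. Identify what was 
done and tell how it will affect the value of the determinant. 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

10. An operation is done to get from the first matrix to the second. Identify what was 
done and tell how it will affect the value of the determinant. 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} c & d \\ a & b \end{bmatrix}$$

11. An operation is done to get from the first matrix to the second. Identify what was 
done and tell how it will affect the value of the determinant. 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}$$

12. An operation is done to get from the first matrix to the second. Identify what was 
done and tell how it will affect the value of the determinant. 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & b \\ 2c & 2d \end{bmatrix}$$

13. An operation is done to get from the first matrix to the second. Identify what was 
done and tell how it will affect the value of the determinant. 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} b & a \\ d & c \end{bmatrix}$$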


14. Let A be an r x r matrix and suppose there are r - 1 rows (columns) such that all rows 
(columns) are linear combinations of these r - 1 rows (columns). Show det (A) = 0. 

15. Show det (aA) = a^n det (A) for an n x n matrix A and scalar a. 

16. Construct 2 x 2 matrices A and B to show that det (A) det (B) = det (AB). 

17. Is it true that det (A + B) = det (A) + det (B)? If this is so, explain why. If it is not 
so, give a counter example. 

18. An n x n matrix is called nilpotent if for some positive integer k it follows that A^k = 0. If 
A is a nilpotent matrix and k is the smallest possible integer such that A^k = 0, what 
are the possible values of det (A)? 

19. A matrix is said to be orthogonal if A^T A = I. Thus the inverse of an orthogonal 
matrix is just its transpose. What are the possible values of det (A) if A is an orthogonal 
matrix? 

20. Let A and B be two n x n matrices. A ~ B (A is similar to B) means there 
exists an invertible matrix P such that A = P^{-1}BP. Show that if A ~ B, then 
det (A) = det (B). 

21. Tell whether each statement is true or false. If true, provide a proof. If false, provide 
a counter example. 

(a) If A is a 3 x 3 matrix with a zero determinant, then one column must be a multiple 
of some other column. 

(b) If any two columns of a square matrix are equal, then the determinant of the 
matrix equals zero. 

(c) For two n x n matrices A and B, det (A + B) = det (A) + det ( B ) . 

(d) For an n x n matrix A, det (3A) = 3 det (A). 

(e) If A^{-1} exists then det (A^{-1}) = det (A)^{-1}. 

(f) If B is obtained by multiplying a single row of A by 4 then det (B) = 4 det (A). 

(g) For A an n x n matrix, det (-A) = (-1)^n det (A). 

(h) If A is a real n x n matrix, then det (A^T A) > 0. 

(i) If A^k = 0 for some positive integer k, then det (A) = 0. 

(j) If AX = 0 for some X ≠ 0, then det (A) = 0. 

22. Find the determinant using row operations to first simplify. 

$$\begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ -4 & 1 & 2 \end{vmatrix}$$

23. Find the determinant using row operations to first simplify. 

$$\begin{vmatrix} 2 & 1 & 3 \\ 2 & 4 & 2 \\ 1 & 4 & -5 \end{vmatrix}$$

24. Find the determinant using row operations to first simplify. 

$$\begin{vmatrix} 1 & 2 & 1 & 2 \\ 3 & 1 & -2 & 3 \\ -1 & 0 & 3 & 1 \\ 2 & 3 & 2 & -2 \end{vmatrix}$$

25. Find the determinant using row operations to first simplify. 

$$\begin{vmatrix} 1 & 4 & 1 & 2 \\ 3 & 2 & -2 & 3 \\ -1 & 0 & 3 & 3 \\ 2 & 1 & 2 & -2 \end{vmatrix}$$

3.2 Applications of the Determinant 


Outcomes 


A. Use determinants to determine whether a matrix has an inverse, and evaluate 
the inverse using cofactors. 

B. Apply Cramer's Rule to solve a 2 x 2 or a 3 x 3 linear system. 

C. Given data points, find an appropriate interpolating polynomial and use it to 
estimate points. 


3.2.1. A Formula for the Inverse 


The determinant of a matrix also provides a way to find the inverse of a matrix. Recall the 
definition of the inverse of a matrix in Definition 2.33. We say that $A^{-1}$, an n x n matrix, 
is the inverse of A, also n x n, if $AA^{-1} = I$ and $A^{-1}A = I$. 

We now define a new matrix called the cofactor matrix of A. The cofactor matrix of 
A is the matrix whose ij-th entry is the ij-th cofactor of A. The formal definition is as follows. 



Note that $\operatorname{cof}(A)_{ij}$ denotes the ij-th entry of the cofactor matrix. 

We will use the cofactor matrix to create a formula for the inverse of A. First, we define 
the adjugate of A to be the transpose of the cofactor matrix. We can also call this matrix 
the classical adjoint of A, and we denote it by adj (A). 

In the specific case where A is a 2 x 2 matrix given by 

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

then adj (A) is given by 

$$\operatorname{adj}(A) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

In general, adj (A) can always be found by taking the transpose of the cofactor matrix of 
A. The following theorem provides a formula for $A^{-1}$ using the determinant and adjugate 
of A. 



The proof of this Theorem is below, after two examples demonstrating this concept. 
Notice that this formula is only defined when det (A) ≠ 0. 

Consider the following example. 


Example 3.34: Find Inverse Using the Determinant 

Find the inverse of the matrix 

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix}$$

using the formula in Theorem 3.33. 



Solution. According to Theorem 3.33, 

$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)$$

First we will find the determinant of this matrix. Using Theorems 3.16, 3.18, and 3.21, 
we can first simplify the matrix through row operations. First, add -3 times the first row 
to the second row. Then add -1 times the first row to the third row to obtain 

$$B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -8 \\ 0 & 0 & -2 \end{bmatrix}$$

By Theorem 3.21, det (A) = det (B). By Theorem 3.13, det (B) = 1 x (-6) x (-2) = 12. Hence, 
det (A) = 12. 

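Now, we need to find adj (A). To do so, first we will find the cofactor matrix of A. This 
is given by 

$$\operatorname{cof}(A) = \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix}$$

Here, the ij-th entry is the ij-th cofactor of the original matrix A which you can verify. 
Therefore, from Theorem 3.33, the inverse of A is given by 

$$A^{-1} = \frac{1}{12} \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix}^T = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix}$$

Remember that we can always verify our answer for $A^{-1}$. Compute the product $AA^{-1}$ 
and $A^{-1}A$ and make sure each product is equal to I. 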

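Compute $A^{-1}A$ as follows 

$$A^{-1}A = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I$$

You can verify that $AA^{-1} = I$ and hence our answer is correct. 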


□ 


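We will look at another example of how to use this formula to find $A^{-1}$. 


Example 3.35: Find the Inverse From a Formula 

Find the inverse of the matrix 

$$A = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix}$$

using the formula given in Theorem 3.33. 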




Solution. First we need to find det (A). This step is left as an exercise and you should verify 
that det (A) = 1/6. The inverse is therefore equal to 

$$A^{-1} = \frac{1}{(1/6)} \operatorname{adj}(A) = 6 \operatorname{adj}(A)$$
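We continue to calculate as follows. Here we show the 2 x 2 determinants needed to find 
the cofactors. 

$$A^{-1} = 6 \begin{bmatrix}
\begin{vmatrix} \frac{1}{3} & -\frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} -\frac{1}{6} & -\frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} -\frac{1}{6} & \frac{1}{3} \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\
-\begin{vmatrix} 0 & \frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\
\begin{vmatrix} 0 & \frac{1}{2} \\ \frac{1}{3} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{6} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{1}{3} \end{vmatrix}
\end{bmatrix}^T$$

Expanding all the 2 x 2 determinants, this yields 

$$A^{-1} = 6 \begin{bmatrix} \frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ \frac{1}{3} & \frac{1}{6} & -\frac{1}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{6} \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix}$$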

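Again, you can always check your work by multiplying $A^{-1}A$ and $AA^{-1}$ and ensuring 
these products equal I. 

$$A^{-1}A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

This tells us that our calculation for $A^{-1}$ is correct. It is left to the reader to verify that 
$AA^{-1} = I$. □ 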


The verification step is very important, as it is a simple way to check your work! If you 
multiply $A^{-1}A$ and $AA^{-1}$ and these products are not both equal to I, be sure to go back and 
double check each step. One common error is to forget to take the transpose of the cofactor 
matrix, so be sure to complete this step. 
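
The formula for the inverse, including the transpose step that is easy to forget, can be sketched as follows. This is an illustration under stated assumptions: the function names `det`, `cofactor_matrix`, and `inverse` are hypothetical, determinants are computed by Laplace expansion, and exact fractions are used so the result can be compared with Example 3.34.

```python
from fractions import Fraction

def det(m):
    """Laplace expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def cofactor_matrix(m):
    """Entry (i, j) is (-1)^(i+j) times the minor obtained by deleting row i and column j."""
    n = len(m)
    return [[(-1) ** (i + j) * det([r[:j] + r[j + 1:] for k, r in enumerate(m) if k != i])
             for j in range(n)] for i in range(n)]

def inverse(m):
    """Inverse as adj(A) / det(A), where adj(A) is the transpose of the cofactor matrix."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is not invertible: determinant is 0")
    cof = cofactor_matrix(m)
    n = len(m)
    # Note the swapped indices: entry (i, j) of adj(A) is cof[j][i].
    return [[Fraction(cof[j][i]) / d for j in range(n)] for i in range(n)]

A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]   # the matrix of Example 3.34
A_inv = inverse(A)
for row in A_inv:
    print(*row)
```

Swapping the indices in `inverse` is what implements the transpose; using `cof[i][j]` instead would reproduce the common error mentioned above.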

We will now prove Theorem 3.33. 

Proof. First if A is invertible, then by Theorem ?? we have: 

$$1 = \det(I) = \det(AA^{-1}) = \det(A) \det(A^{-1})$$

and thus det (A) ≠ 0. 

Equivalently, if det (A) = 0, then A is not invertible. 

Now assume det (A) ≠ 0. From the definition of the determinant in terms of expansion 
along a column, and letting $A = [a_{ir}]$, if det (A) ≠ 0, 

$$\sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)_{ir} \det(A)^{-1} = \det(A) \det(A)^{-1} = 1$$

Now consider 

$$\sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1}$$

when k ≠ r. Replace the k-th column with the r-th column to obtain a matrix $B_k$ whose 
determinant equals zero by Theorem 3.16. However, expanding this matrix $B_k$ along the k-th 
column yields 

$$0 = \det(B_k) \det(A)^{-1} = \sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1}$$

Summarizing, 

$$\sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)_{ik} \det(A)^{-1} = \delta_{rk} = \begin{cases} 1 & \text{if } r = k \\ 0 & \text{if } r \neq k \end{cases}$$

Now 

$$\sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)_{ik} = \sum_{i=1}^{n} a_{ir} \operatorname{cof}(A)^T_{ki}$$

which is the kr-th entry of $\operatorname{cof}(A)^T A$. Therefore, 

$$\frac{\operatorname{cof}(A)^T}{\det(A)} A = I \tag{3.1}$$

Using the other formula in Definition 3.11, and similar reasoning, 

$$\sum_{j=1}^{n} a_{rj} \operatorname{cof}(A)_{kj} \det(A)^{-1} = \delta_{rk}$$

Now 

$$\sum_{j=1}^{n} a_{rj} \operatorname{cof}(A)_{kj} = \sum_{j=1}^{n} a_{rj} \operatorname{cof}(A)^T_{jk}$$

which is the rk-th entry of $A \operatorname{cof}(A)^T$. Therefore, 

$$A \frac{\operatorname{cof}(A)^T}{\det(A)} = I \tag{3.2}$$

and it follows from 3.1 and 3.2 that $A^{-1} = [a^{-1}_{ij}]$, where 

$$a^{-1}_{ij} = \operatorname{cof}(A)_{ji} \det(A)^{-1}$$

In other words, 

$$A^{-1} = \frac{\operatorname{cof}(A)^T}{\det(A)}$$

□ 
This method for finding the inverse of A is useful in many contexts. In particular, it is 
useful with complicated matrices where the entries are functions, rather than numbers. 
Consider the following example. 



Solution. First note $\det(A(t)) = e^t (\cos^2 t + \sin^2 t) = e^t \neq 0$ so $A(t)^{-1}$ exists. 
The cofactor matrix is 

$$C(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix}$$

and so the inverse is 

$$\frac{1}{e^t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix}^T = \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{bmatrix}$$

□ 


3.2.2. Cramer’s Rule 


Another context in which the formula given in Theorem 3.33 is important is Cramer’s 
Rule. Recall that we can represent a system of linear equations in the form AX = B, where 
the solutions to this system are given by X. Cramer’s Rule gives a formula for the solutions 
X in the special case that A is a square invertible matrix. Note this rule does not apply 
if you have a system of equations in which there is a different number of equations than 
variables (in other words, when A is not square), or when A is not invertible. 

Suppose we have a system of equations given by AX = B, and we want to find solutions 
X which satisfy this system. Then recall that if $A^{-1}$ exists, 

$$\begin{aligned} AX &= B \\ A^{-1}(AX) &= A^{-1}B \\ (A^{-1}A)X &= A^{-1}B \\ IX &= A^{-1}B \\ X &= A^{-1}B \end{aligned}$$

Hence, the solutions X to the system are given by $X = A^{-1}B$. Since we assume that 
$A^{-1}$ exists, we can use the formula for $A^{-1}$ given above. Substituting this formula into the 
equation for X, we have 

$$X = A^{-1}B = \frac{1}{\det(A)} \operatorname{adj}(A) B$$

Let $x_i$ be the i-th entry of X and $b_j$ be the j-th entry of B. Then this equation becomes 

$$x_i = \sum_{j=1}^{n} a^{-1}_{ij} b_j = \sum_{j=1}^{n} \frac{1}{\det(A)} \operatorname{adj}(A)_{ij} b_j$$

where $\operatorname{adj}(A)_{ij}$ is the ij-th entry of adj (A). 

By the formula for the expansion of a determinant along a column, 

$$x_i = \frac{1}{\det(A)} \det \begin{bmatrix} * & \cdots & b_1 & \cdots & * \\ \vdots & & \vdots & & \vdots \\ * & \cdots & b_n & \cdots & * \end{bmatrix}$$

where here the i-th column of A is replaced with the column vector $[b_1, \cdots, b_n]^T$. The 
determinant of this modified matrix is taken and divided by det (A). This formula is known as 
Cramer's rule. 

We formally define this method now. 



$$x_i = \frac{\det(A_i)}{\det(A)}$$


We illustrate this procedure in the following example. 


Example 3.38: Using Cramer’s Rule 


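Find x, y, z if 

$$\begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$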
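Solution. We will use the method outlined in Procedure 3.37 to find the values for x, y, z which 
give the solution to this system. Let 

$$B = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$

In order to find x, we calculate 

$$x = \frac{\det(A_1)}{\det(A)}$$

where $A_1$ is the matrix obtained from replacing the first column of A with B. 
Hence, $A_1$ is given by 

$$A_1 = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{bmatrix}$$

Therefore, 

$$x = \frac{\det(A_1)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{1}{2}$$

Similarly, to find y we construct $A_2$ by replacing the second column of A with B. Hence, 
$A_2$ is given by 

$$A_2 = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{bmatrix}$$

Therefore, 

$$y = \frac{\det(A_2)}{\det(A)} = \frac{\begin{vmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = -\frac{1}{7}$$

Similarly, $A_3$ is constructed by replacing the third column of A with B. Then, $A_3$ is given 
by 

$$A_3 = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{bmatrix}$$

Therefore, z is calculated as follows. 

$$z = \frac{\det(A_3)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{11}{14}$$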

□ 


Cramer's Rule gives you another tool to consider when solving a system of linear equations. 
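
Cramer's Rule translates directly into a short routine: replace one column at a time with B and divide the resulting determinant by det (A). The sketch below is illustrative only; the names `det` and `cramer` are assumptions, integer entries are assumed so that exact fractions can be formed, and the system is the one from Example 3.38.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def cramer(a, b):
    """Solve AX = B for a square matrix A with integer entries and det(A) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule requires det(A) to be nonzero")
    solution = []
    for i in range(len(a)):
        # A_i: the matrix A with its i-th column replaced by B.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution

A = [[1, 2, 1], [3, 2, 1], [2, -3, 2]]
B = [1, 2, 3]
print(*cramer(A, B))   # 1/2 -1/7 11/14
```

The routine refuses to run when det (A) = 0, which mirrors the restriction that Cramer's Rule applies only when A is square and invertible.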

We can also use Cramer’s Rule for systems of non linear equations. Consider the following 
system where the matrix A has functions rather than numbers for entries. 


Example 3.39: Use Cramer’s Rule for Non-Constant Matrix 


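Solve for z if 

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ t \\ t^2 \end{bmatrix}$$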

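Solution. We are asked to find the value of z in the solution. We will solve using Cramer's 
rule. Thus 

$$z = \frac{\begin{vmatrix} 1 & 0 & 1 \\ 0 & e^t \cos t & t \\ 0 & -e^t \sin t & t^2 \end{vmatrix}}{\begin{vmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{vmatrix}} = t \left( (\cos t)\, t + \sin t \right) e^{-t}$$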


□ 
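
The closed form obtained in Example 3.39 can be spot-checked numerically at a few values of t. This is a sketch under stated assumptions: the helper names are hypothetical, the right-hand side of the system is taken to be (1, t, t^2) as read from the third column of the numerator matrix, and floating-point arithmetic is used, so agreement is only up to rounding.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def z_by_cramer(t):
    et = math.exp(t)
    A = [[1, 0, 0],
         [0, et * math.cos(t), et * math.sin(t)],
         [0, -et * math.sin(t), et * math.cos(t)]]
    # A_3: the third column of A replaced by the right-hand side (1, t, t^2).
    A3 = [[1, 0, 1],
          [0, et * math.cos(t), t],
          [0, -et * math.sin(t), t * t]]
    return det3(A3) / det3(A)

def z_closed_form(t):
    return t * (math.cos(t) * t + math.sin(t)) * math.exp(-t)

for t in (0.0, 0.5, 2.0):
    print(t, z_by_cramer(t), z_closed_form(t))
```

The two columns of output should agree to within floating-point error for every t, since det (A) = e^{2t} is never zero.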


3.2.3. Polynomial Interpolation 

In studying a set of data that relates variables x and y , it may be the case that we can use 
a polynomial to “fit” to the data. If such a polynomial can be established, it can be used to 
estimate values of x and y which have not been provided. 

Consider the following example. 


Example 3.40: Polynomial Interpolation 


Given data points (1, 4), (2, 9), (3, 12), find an interpolating polynomial p(x) of degree 
at most 2 and then estimate the value corresponding to x = 1/2. 

Solution. We want to find a polynomial given by 

$$p(x) = r_0 + r_1 x + r_2 x^2$$

such that p(1) = 4, p(2) = 9 and p(3) = 12. To find this polynomial, substitute the known 
values in for x and solve for $r_0$, $r_1$, and $r_2$. 

$$\begin{aligned} p(1) &= r_0 + r_1 + r_2 = 4 \\ p(2) &= r_0 + 2r_1 + 4r_2 = 9 \\ p(3) &= r_0 + 3r_1 + 9r_2 = 12 \end{aligned}$$

Writing the augmented matrix, we have 


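$$\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 1 & 2 & 4 & 9 \\ 1 & 3 & 9 & 12 \end{array} \right]$$

After row operations, the resulting matrix is 

$$\left[ \begin{array}{ccc|c} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & -1 \end{array} \right]$$

Therefore the solution to the system is $r_0 = -3$, $r_1 = 8$, $r_2 = -1$ and the required 
interpolating polynomial is 

$$p(x) = -3 + 8x - x^2$$

To estimate the value for x = 1/2, we calculate p(1/2). 

$$p\left(\tfrac{1}{2}\right) = -3 + 8\left(\tfrac{1}{2}\right) - \left(\tfrac{1}{2}\right)^2 = -3 + 4 - \tfrac{1}{4} = \tfrac{3}{4}$$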

□ 

This procedure can be used for any number of data points, and any degree of polynomial. 
The steps are outlined below. 
Procedure 3.41: Finding an Interpolating Polynomial 


Suppose that values of x and corresponding values of y are given, such that the actual 
relationship between x and y is unknown. Then, values of y can be estimated using an 
interpolating polynomial p(x). If given $x_1, \cdots, x_n$ and the corresponding $y_1, \cdots, y_n$, 
the procedure to find p(x) is as follows: 

1. The desired polynomial p(x) is given by 

$$p(x) = r_0 + r_1 x + r_2 x^2 + \cdots + r_{n-1} x^{n-1}$$

2. $p(x_i) = y_i$ for all i = 1, 2, ..., n so that 

$$\begin{aligned} r_0 + r_1 x_1 + r_2 x_1^2 + \cdots + r_{n-1} x_1^{n-1} &= y_1 \\ r_0 + r_1 x_2 + r_2 x_2^2 + \cdots + r_{n-1} x_2^{n-1} &= y_2 \\ &\;\;\vdots \\ r_0 + r_1 x_n + r_2 x_n^2 + \cdots + r_{n-1} x_n^{n-1} &= y_n \end{aligned}$$

3. Set up the augmented matrix of this system of equations 

$$\left[ \begin{array}{ccccc|c} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} & y_1 \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} & y_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} & y_n \end{array} \right]$$

4. Solving this system will result in a unique solution $r_0, r_1, \cdots, r_{n-1}$. Use these 
values to construct p(x), and estimate the value of p(a) for any x = a. 


This procedure motivates the following theorem. 


Theorem 3.42: Polynomial Interpolation 


Given n data points $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$ with the $x_i$ distinct, there is a unique 
polynomial $p(x) = r_0 + r_1 x + r_2 x^2 + \cdots + r_{n-1} x^{n-1}$ such that $p(x_i) = y_i$ for i = 1, 2, ..., n. 
The resulting polynomial p(x) is called the interpolating polynomial for the data 
points. 


We conclude this section with another example. 


Example 3.43: Polynomial Interpolation 


Consider the data points (0, 1), (1, 2), (3, 22), (5, 66). Find an interpolating polynomial 
p(x) of degree at most three, and estimate the value of p( 2). 
Solution. The desired polynomial p(x) is given by: 

$$p(x) = r_0 + r_1 x + r_2 x^2 + r_3 x^3$$

Using the given points, the system of equations is 

$$\begin{aligned} p(0) &= r_0 = 1 \\ p(1) &= r_0 + r_1 + r_2 + r_3 = 2 \\ p(3) &= r_0 + 3r_1 + 9r_2 + 27r_3 = 22 \\ p(5) &= r_0 + 5r_1 + 25r_2 + 125r_3 = 66 \end{aligned}$$

The augmented matrix is given by: 

$$\left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 2 \\ 1 & 3 & 9 & 27 & 22 \\ 1 & 5 & 25 & 125 & 66 \end{array} \right]$$

The resulting matrix is 

$$\left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 0 \end{array} \right]$$

Therefore, $r_0 = 1$, $r_1 = -2$, $r_2 = 3$, $r_3 = 0$ and $p(x) = 1 - 2x + 3x^2$. To estimate the value 
of p(2), we compute $p(2) = 1 - 2(2) + 3(2^2) = 1 - 4 + 12 = 9$. □ 
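
The steps of Procedure 3.41 can be sketched in code: build the augmented matrix from the data points, row reduce it, and read off the coefficients. This is an illustrative implementation, not the text's own; the function names `interpolate` and `evaluate` are assumptions, exact fractions are used, and the x values are assumed to be distinct.

```python
from fractions import Fraction

def interpolate(points):
    """Return the coefficients r_0, ..., r_{n-1} of the interpolating polynomial."""
    n = len(points)
    # Augmented matrix [1, x, x^2, ..., x^(n-1) | y], one row per data point.
    aug = [[Fraction(x) ** k for k in range(n)] + [Fraction(y)] for x, y in points]
    for col in range(n):
        # With distinct x values a nonzero pivot always exists.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]          # scale the pivot row to a leading 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]                      # last column holds r_0, ..., r_{n-1}

def evaluate(coeffs, x):
    """Evaluate r_0 + r_1 x + ... + r_{n-1} x^(n-1)."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))

r = interpolate([(1, 4), (2, 9), (3, 12)])
print(*r, evaluate(r, Fraction(1, 2)))    # -3 8 -1 3/4

s = interpolate([(0, 1), (1, 2), (3, 22), (5, 66)])
print(*s, evaluate(s, 2))                 # 1 -2 3 0 9
```

The two calls reproduce Examples 3.40 and 3.43, including the zero cubic coefficient in the second case.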


3.2.4. Exercises 


1. Let 


$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 3 & 1 & 0 \end{bmatrix}$$


Determine whether the matrix A has an inverse by finding whether the determinant 
is non zero. If the determinant is nonzero, find the inverse using the formula for the 
inverse which involves the cofactor matrix. 


2. Let 


$$A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \\ 3 & 1 & 1 \end{bmatrix}$$


Determine whether the matrix A has an inverse by finding whether the determinant 
is non zero. If the determinant is nonzero, find the inverse using the formula for the 
inverse. 
3. Let 

$$A = \begin{bmatrix} 1 & 3 & 3 \\ 2 & 4 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

Determine whether the matrix A has an inverse by finding whether the determinant 
is non zero. If the determinant is nonzero, find the inverse using the formula for the 
inverse. 


4. Let 


$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 2 & 6 & 7 \end{bmatrix}$$


Determine whether the matrix A has an inverse by finding whether the determinant 
is non zero. If the determinant is nonzero, find the inverse using the formula for the 
inverse. 


5. Let 


$$A = \begin{bmatrix} 1 & 0 & 3 \\ 1 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix}$$


Determine whether the matrix A has an inverse by finding whether the determinant 
is non zero. If the determinant is nonzero, find the inverse using the formula for the 
inverse. 


6. For the following matrices, determine if they are invertible. If so, use the formula for 
the inverse in terms of the cofactor matrix to find each inverse. If the inverse does not 
exist, explain why. 


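(a) $\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 0 \\ 0 & 1 & 2 \end{bmatrix}$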


7. Consider the matrix 


$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{bmatrix}$$


Does there exist a value of t for which this matrix fails to have an inverse? Explain. 
8. Consider the matrix 

$$A = \begin{bmatrix} 1 & t & t^2 \\ 0 & 1 & 2t \\ t & 0 & 2 \end{bmatrix}$$

Does there exist a value of t for which this matrix fails to have an inverse? Explain. 


9. Consider the matrix 


$$A = \begin{bmatrix} e^t & \cosh t & \sinh t \\ e^t & \sinh t & \cosh t \\ e^t & \cosh t & \sinh t \end{bmatrix}$$


Does there exist a value of t for which this matrix fails to have an inverse? Explain. 


10. Consider the matrix 


A = 



e _t cos t e _t sin t 

— e _t cos t — e _t sin t — e - * sin t + e _t cos t 
2e _t sin t — 2e _t cos t 


Does there exist a value of t for which this matrix fails to have an inverse? Explain. 

11. Show that if det (A) ≠ 0 for A an n x n matrix, it follows that if AX = 0, then X = 0. 

12. Suppose A, B are n x n matrices and that AB = I. Show that then BA = I. Hint: 
First explain why det (A), det (B) are both nonzero. Then (AB) A = A and then show 
BA (BA - I) = 0. From this use what is given to conclude A (BA - I) = 0. Then use 
Problem 11. 


13. Use the formula for the inverse in terms of the cofactor matrix to find the inverse of 
the matrix 

$$A = \begin{bmatrix} e^t & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & e^t \cos t - e^t \sin t & e^t \cos t + e^t \sin t \end{bmatrix}$$


14. Find the inverse, if it exists, of the matrix 


$$A = \begin{bmatrix} e^t & \cos t & \sin t \\ e^t & -\sin t & \cos t \\ e^t & -\cos t & -\sin t \end{bmatrix}$$


15. Suppose A is an upper triangular matrix. Show that $A^{-1}$ exists if and only if all 
elements of the main diagonal are non zero. Is it true that $A^{-1}$ will also be upper 
triangular? Explain. Could the same be concluded for lower triangular matrices? 

16. If A, B, and C are each n x n matrices and ABC is invertible, show why each of A, B, 
and C are invertible. 

17. Decide if this statement is true or false: Cramer’s rule is useful for finding solutions 
to systems of linear equations in which there is an infinite set of solutions. 
18. Use Cramer's rule to find the solution to 

x + 2y = 1 
2x - y = 2 

19. Use Cramer's rule to find the solution to 

x + 2y + z = 1 
2x - y - z = 2 
x + z = 1 
4. $\mathbb{R}^n$ 


4.1 Vectors in $\mathbb{R}^n$ 


Outcomes 


A. Find the position vector of a point in $\mathbb{R}^n$. 


The notation $\mathbb{R}^n$ refers to the collection of ordered lists of n real numbers, that is 

$$\mathbb{R}^n = \{ (x_1, \cdots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \cdots, n \}$$

In this chapter, we take a closer look at vectors in $\mathbb{R}^n$. First, we will consider what $\mathbb{R}^n$ looks 
like in more detail. Recall that the point given by 0 = (0, ..., 0) is called the origin. 

Now, consider the case of $\mathbb{R}^n$ for n = 1. Then from the definition we can identify $\mathbb{R}$ with 
points in $\mathbb{R}^1$ as follows: 

$$\mathbb{R} = \mathbb{R}^1 = \{ (x_1) : x_1 \in \mathbb{R} \}$$

Hence, $\mathbb{R}$ is defined as the set of all real numbers and geometrically, we can describe this as 
all the points on a line. 

Now suppose n = 2. Then, from the definition, 

$$\mathbb{R}^2 = \{ (x_1, x_2) : x_j \in \mathbb{R} \text{ for } j = 1, 2 \}$$

Consider the familiar coordinate plane, with an x axis and a y axis. Any point within this 
coordinate plane is identified by where it is located along the x axis, and also where it is 
located along the y axis. Consider as an example the following diagram. 
[Figure: the coordinate plane with x and y axes, showing the points P = (2, 1) and Q = (-3, 4).] 

Hence, every element in M 2 is identified by two components, x and y, in the usual manner. 
The coordinates x,y (or aq,^) uniquely determine a point in the plan. Note that while the 
definition uses X\ and x 2 to label the coordinates and you may be used to x and y, these 
notations are equivalent. 

Now suppose $n = 3$. You may have previously encountered the 3-dimensional coordinate
system, given by

$$\mathbb{R}^3 = \{(x_1, x_2, x_3) : x_j \in \mathbb{R} \text{ for } j = 1, 2, 3\}$$

Points in $\mathbb{R}^3$ will be determined by three coordinates, often written $(x, y, z)$, which correspond
to the $x$, $y$, and $z$ axes. We can think as above that the first two coordinates determine
a point in a plane. The third component determines the height above or below the plane,
depending on whether this number is positive or negative, and all together this determines
a point in space. You see that the ordered triples correspond to points in space just as the
ordered pairs correspond to points in a plane and single real numbers correspond to points
on a line.

The idea behind the more general $\mathbb{R}^n$ is that we can extend these ideas beyond $n = 3$.
This discussion regarding points in $\mathbb{R}^n$ leads into a study of vectors in $\mathbb{R}^n$. While we consider
$\mathbb{R}^n$ for all $n$, we will largely focus on $n = 2, 3$ in this section.

Consider the following definition. 



For this reason we may write both $P = (p_1, \cdots, p_n) \in \mathbb{R}^n$ and $\overrightarrow{0P} = [p_1 \cdots p_n]^T \in \mathbb{R}^n$.
This definition is illustrated in the following picture for the special case of $\mathbb{R}^3$.

[Figure: the position vector $\overrightarrow{0P}$ drawn from the origin to the point $P = (p_1, p_2, p_3)$.]



Thus every point $P$ in $\mathbb{R}^n$ determines its position vector $\overrightarrow{0P}$. Conversely, every such
position vector $\overrightarrow{0P}$ which has its tail at $0$ and point at $P$ determines the point $P$ of $\mathbb{R}^n$.

Now suppose we are given two points, $P, Q$, whose coordinates are $(p_1, \cdots, p_n)$ and
$(q_1, \cdots, q_n)$ respectively. We can also determine the position vector from $P$ to $Q$ (also
called the vector from $P$ to $Q$) defined as follows.




Now, imagine taking a vector in $\mathbb{R}^n$ and moving it around, always keeping it pointing in
the same direction as shown in the following picture.



After moving it around, it is regarded as the same vector. The vectors $\overrightarrow{0P}$ and $\overrightarrow{AB}$ have
the same length (or magnitude) and direction. Therefore, they are equal.

Consider now the general definition for a vector in $\mathbb{R}^n$.


Definition 4.2: Vectors in $\mathbb{R}^n$

Let $\mathbb{R}^n = \{(x_1, \cdots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \cdots, n\}$. Then,

$$\vec{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

is called a vector. Vectors have both size (magnitude) and direction. The numbers
$x_j$ are called the components of $\vec{x}$.


Using this notation, we may use $\vec{p}$ to denote the position vector of point $P$. Notice that
in this context, $\vec{p} = \overrightarrow{0P}$. These notations may be used interchangeably.
You can think of the components of a vector as directions for obtaining the vector.
Consider $n = 3$. Draw a vector with its tail at the point $(0, 0, 0)$ and its tip at the point
$(a, b, c)$. This vector is obtained by starting at $(0, 0, 0)$, moving parallel to the $x$ axis to
$(a, 0, 0)$ and then from here, moving parallel to the $y$ axis to $(a, b, 0)$ and finally parallel to the
$z$ axis to $(a, b, c)$. Observe that the same vector would result if you began at the point $(d, e, f)$,
moved parallel to the $x$ axis to $(d + a, e, f)$, then parallel to the $y$ axis to $(d + a, e + b, f)$,
and finally parallel to the $z$ axis to $(d + a, e + b, f + c)$. Here, the vector would have its tail
sitting at the point determined by $A = (d, e, f)$ and its point at $B = (d + a, e + b, f + c)$. It
is the same vector because it will point in the same direction and have the same length.
It is like you took an actual arrow and moved it from one location to another, keeping it
pointing in the same direction.

We conclude this section with a brief discussion regarding notation. In previous sections, 
we have written vectors as columns, or $n \times 1$ matrices. For convenience in this chapter we
may write vectors as the transpose of row vectors, or $1 \times n$ matrices. These are of course
equivalent and we may move between both notations. Therefore, recognize that 



Notice that two vectors $\vec{u} = [u_1 \cdots u_n]^T$ and $\vec{v} = [v_1 \cdots v_n]^T$ are equal if and only if all
corresponding components are equal. Precisely,

$$\vec{u} = \vec{v} \text{ if and only if } u_j = v_j \text{ for all } j = 1, \cdots, n$$

Thus $[1 \; 2 \; 4]^T \in \mathbb{R}^3$ and $[2 \; 1 \; 4]^T \in \mathbb{R}^3$ but $[1 \; 2 \; 4]^T \neq [2 \; 1 \; 4]^T$ because,
even though the same numbers are involved, the order of the numbers is different.

For the specific case of $\mathbb{R}^3$, there are three special vectors which we often use. They are
given by

$$\vec{i} = [1 \; 0 \; 0]^T$$
$$\vec{j} = [0 \; 1 \; 0]^T$$
$$\vec{k} = [0 \; 0 \; 1]^T$$

We can write any vector $\vec{u} = [u_1 \; u_2 \; u_3]^T$ as a linear combination of these vectors, written
as $\vec{u} = u_1\vec{i} + u_2\vec{j} + u_3\vec{k}$. This notation will be used throughout this chapter.

4.2 Algebra in $\mathbb{R}^n$


Outcomes 


A. Understand vector addition and scalar multiplication, algebraically. 
Addition and scalar multiplication are two important algebraic operations done with
vectors. Notice that these operations apply to vectors in $\mathbb{R}^n$, for any value of $n$. We will
explore these operations in more detail in the following sections.

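4.2.1. Addition of Vectors in $\mathbb{R}^n$


Addition of vectors in $\mathbb{R}^n$ is defined as follows.


Definition 4.3: Addition of Vectors in $\mathbb{R}^n$

If $\vec{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}, \vec{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \in \mathbb{R}^n$ then $\vec{u} + \vec{v} \in \mathbb{R}^n$ and is defined by

$$\vec{u} + \vec{v} = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{bmatrix}$$
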

To add vectors, we simply add corresponding components exactly as we did for matrices. 
Therefore, in order to add vectors, they must be the same size. 
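
The componentwise rule can be sketched in a few lines of Python. This is only an illustration: the function name `add_vectors` and the use of plain lists to stand for vectors are assumptions made here, not notation from the text.

```python
def add_vectors(u, v):
    """Add two vectors given as lists of numbers, component by component."""
    if len(u) != len(v):
        # Vectors of different sizes cannot be added.
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]


print(add_vectors([1, 2, 4], [2, 1, 4]))  # [3, 3, 8]
```

The size check mirrors the requirement that both vectors lie in the same space.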

Similarly to matrices, addition of vectors satisfies some important properties. These are 
outlined in the following theorem. 




Theorem 4.4: Properties of Vector Addition 


The following properties hold for vectors $\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^n$.

• The Commutative Law of Addition

$$\vec{u} + \vec{v} = \vec{v} + \vec{u}$$

• The Associative Law of Addition

$$(\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w})$$

• The Existence of an Additive Identity

$$\vec{u} + \vec{0} = \vec{u} \qquad (4.1)$$

• The Existence of an Additive Inverse

$$\vec{u} + (-\vec{u}) = \vec{0}$$


The proof of this theorem follows from the similar theorem given for matrices in Proposition 2.7.
The additive identity shown in equation 4.1 is also called the zero vector,
the $n \times 1$ vector in which all components are equal to 0. Further, $-\vec{u}$ is simply the vector
whose components have the same magnitude as those of $\vec{u}$ but opposite sign; this is just $(-1)\vec{u}$.
This will be made more explicit in the next section when we explore scalar multiplication of
vectors. Note that subtraction is defined as $\vec{u} - \vec{v} = \vec{u} + (-\vec{v})$.


4.2.2. Scalar Multiplication of Vectors in $\mathbb{R}^n$


Scalar multiplication of vectors in $\mathbb{R}^n$ is defined as follows. Notice that, just like addition,
this definition is the same as the corresponding definition for matrices.


Definition 4.5: Scalar Multiplication of Vectors in $\mathbb{R}^n$


If $\vec{u} \in \mathbb{R}^n$ and $k \in \mathbb{R}$ is a scalar, then $k\vec{u} \in \mathbb{R}^n$ is defined by

$$k\vec{u} = k \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} ku_1 \\ \vdots \\ ku_n \end{bmatrix}$$


Just as with addition, scalar multiplication of vectors satisfies several important properties.
These are outlined in the following theorem.
Proof: Again the verification of these properties follows from the corresponding properties
for scalar multiplication of matrices, given in Proposition 2.10.

As a refresher we can show that

$$k(\vec{u} + \vec{v}) = k\vec{u} + k\vec{v}$$

Note that:

$$\begin{aligned}
k(\vec{u} + \vec{v}) &= k[u_1 + v_1 \cdots u_n + v_n]^T \\
&= [k(u_1 + v_1) \cdots k(u_n + v_n)]^T \\
&= [ku_1 + kv_1 \cdots ku_n + kv_n]^T \\
&= [ku_1 \cdots ku_n]^T + [kv_1 \cdots kv_n]^T \\
&= k\vec{u} + k\vec{v}
\end{aligned}$$
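
As a small numerical companion to this derivation, the following Python sketch checks the identity $k(\vec{u} + \vec{v}) = k\vec{u} + k\vec{v}$ on one example. The helper names `scale` and `add` are hypothetical, and integer components are used so that the comparison is exact. A single example is of course not a proof.

```python
def scale(k, u):
    """Multiply every component of the vector u by the scalar k."""
    return [k * ui for ui in u]


def add(u, v):
    """Add two vectors of the same size component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]


u = [5, -1, 2, -3]
v = [-8, 2, -3, 6]
k = 3

left = scale(k, add(u, v))             # k(u + v)
right = add(scale(k, u), scale(k, v))  # ku + kv
print(left, right, left == right)
```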


4.2.3. Exercises 


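1. Find $-3 \begin{bmatrix} 5 \\ -1 \\ 2 \\ -3 \end{bmatrix} + 5 \begin{bmatrix} -8 \\ 2 \\ -3 \\ 6 \end{bmatrix}$.

2. Find $-7 \begin{bmatrix} 6 \\ 0 \\ 1 \\ -1 \end{bmatrix} + 6 \begin{bmatrix} -13 \\ -1 \\ 4 \\ 6 \end{bmatrix}$.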
4.3 Geometric Meaning of Vector Addition


Outcomes


A. Understand vector addition, geometrically.


Recall that an element of $\mathbb{R}^n$ is an ordered list of numbers. For the specific case of
$n = 2, 3$ this can be used to determine a point in two or three dimensional space. This point
is specified relative to some coordinate axes.

Consider the case $n = 3$. Recall that taking a vector and moving it around without changing
its length or direction does not change the vector. This is important in the geometric
representation of vector addition.

Suppose we have two vectors, $\vec{u}$ and $\vec{v}$ in $\mathbb{R}^3$. Each of these can be drawn geometrically
by placing the tail of each vector at $0$ and its point at $(u_1, u_2, u_3)$ and $(v_1, v_2, v_3)$ respectively.
Suppose we slide the vector $\vec{v}$ so that its tail sits at the point of $\vec{u}$. We know that this does
not change the vector $\vec{v}$. Now, draw a new vector from the tail of $\vec{u}$ to the point of $\vec{v}$. This
vector is $\vec{u} + \vec{v}$.

The geometric significance of vector addition in $\mathbb{R}^n$ for any $n$ is given in the following
definition.



This definition is illustrated in the following picture in which u + v is shown for the special 
case n = 3 . 
Notice the parallelogram created by $\vec{u}$ and $\vec{v}$ in the above diagram. Then $\vec{u} + \vec{v}$ is the
directed diagonal of the parallelogram determined by the two vectors $\vec{u}$ and $\vec{v}$.

When you have a vector $\vec{v}$, its additive inverse $-\vec{v}$ will be the vector which has the same
magnitude as $\vec{v}$ but the opposite direction. When one writes $\vec{u} - \vec{v}$, the meaning is $\vec{u} + (-\vec{v})$
as with real numbers. The following example illustrates these definitions and conventions.


Example 4.8: Graphing Vector Addition

Consider the following picture of vectors $\vec{u}$ and $\vec{v}$.


Sketch a picture of $\vec{u} + \vec{v}$, $\vec{u} - \vec{v}$.



Solution. We will first sketch $\vec{u} + \vec{v}$. Begin by drawing $\vec{u}$ and then at the point of $\vec{u}$, place
the tail of $\vec{v}$ as shown. Then $\vec{u} + \vec{v}$ is the vector which results from drawing a vector from
the tail of $\vec{u}$ to the tip of $\vec{v}$.



Next consider $\vec{u} - \vec{v}$. This means $\vec{u} + (-\vec{v})$. From the above geometric description of
vector addition, $-\vec{v}$ is the vector which has the same length but which points in the opposite
direction to $\vec{v}$. Here is a picture. □
4.4 Length of a Vector


Outcomes


A. Find the length of a vector and the distance between two points in $\mathbb{R}^n$.

B. Find the corresponding unit vector to a vector in $\mathbb{R}^n$.


In this section, we explore what is meant by the length of a vector in $\mathbb{R}^n$. We develop
this concept by first looking at the distance between two points in $\mathbb{R}^n$.

First, we will consider the concept of distance for $\mathbb{R}$, that is, for points in $\mathbb{R}^1$. Here, the
distance between two points $P$ and $Q$ is given by the absolute value of their difference. We
denote the distance between $P$ and $Q$ by $d(P, Q)$ which is defined as

$$d(P, Q) = \sqrt{(P - Q)^2} \qquad (4.2)$$

Consider now the case for $n = 2$, demonstrated by the following picture.

[Figure: the points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$ joined by a solid line, with a rectangle in dotted lines.]


There are two points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$ in the plane. The distance between
these points is shown in the picture as a solid line. Notice that this line is the hypotenuse
of a right triangle which is half of the rectangle shown in dotted lines. We want to find
the length of this hypotenuse which will give the distance between the two points. Note
the lengths of the sides of this triangle are $|p_1 - q_1|$ and $|p_2 - q_2|$, the absolute value of the
difference in these values. Therefore, the Pythagorean Theorem implies the length of the
hypotenuse (and thus the distance between $P$ and $Q$) equals

$$\left(|p_1 - q_1|^2 + |p_2 - q_2|^2\right)^{1/2} = \left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2} \qquad (4.3)$$
Now suppose $n = 3$ and let $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$ be two points in $\mathbb{R}^3$.
Consider the following picture in which the solid line joins the two points and a dotted line
joins the points $(q_1, q_2, q_3)$ and $(p_1, p_2, q_3)$.



Here, we need to use the Pythagorean Theorem twice in order to find the length of the solid
line. First, by the Pythagorean Theorem, the length of the dotted line joining $(q_1, q_2, q_3)$ and
$(p_1, p_2, q_3)$ equals

$$\left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2}$$

while the length of the line joining $(p_1, p_2, q_3)$ to $(p_1, p_2, p_3)$ is just $|p_3 - q_3|$. Therefore, by
the Pythagorean Theorem again, the length of the line joining the points $P = (p_1, p_2, p_3)$
and $Q = (q_1, q_2, q_3)$ equals

$$\left(\left(\left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2}\right)^2 + (p_3 - q_3)^2\right)^{1/2} = \left((p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2\right)^{1/2} \qquad (4.4)$$

This discussion motivates the following definition for the distance between points in $\mathbb{R}^n$.



From the above discussion, you can see that Definition 4.9 holds for the special cases
$n = 1, 2, 3$, as in Equations 4.2, 4.3, 4.4. In the following example, we use Definition 4.9 to
find the distance between two points in $\mathbb{R}^4$.
Example 4.10: Distance Between Points


Find the distance between the points $P$ and $Q$ in $\mathbb{R}^4$, where $P$ and $Q$ are given by

$$P = (1, 2, -4, 6)$$

and

$$Q = (2, 3, -1, 0)$$


Solution. We will use the formula given in Definition 4.9 to find the distance between $P$ and
$Q$. Use the distance formula and write

$$d(P, Q) = \left((1 - 2)^2 + (2 - 3)^2 + (-4 - (-1))^2 + (6 - 0)^2\right)^{1/2} = \sqrt{47}$$

Therefore, $d(P, Q) = \sqrt{47}$.

□ 
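
The computation in this example can be reproduced with a short Python sketch. The function name `distance` and the representation of points as tuples are assumptions made for illustration; the formula is the square root of the sum of squared coordinate differences used in the solution above.

```python
import math


def distance(p, q):
    """Distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pj - qj) ** 2 for pj, qj in zip(p, q)))


d = distance((1, 2, -4, 6), (2, 3, -1, 0))
print(d, math.isclose(d, math.sqrt(47)))
```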

There are certain properties of the distance between points which are important in our 
study. These are outlined in the following theorem. 



There are many applications of the concept of distance. For instance, given two points,
we can ask which points are the same distance from each of the given points. This
is explored in the following example.


Example 4.12: The Plane Between Two Points


Describe the points in $\mathbb{R}^3$ which are at the same distance from $(1, 2, 3)$ and $(0, 1, 2)$.


Solution. Let $P = (p_1, p_2, p_3)$ be such a point. Therefore, $P$ is the same distance from $(1, 2, 3)$
and $(0, 1, 2)$. Then by Definition 4.9,

$$\sqrt{(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2} = \sqrt{(p_1 - 0)^2 + (p_2 - 1)^2 + (p_3 - 2)^2}$$

Squaring both sides we obtain

$$(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2 = p_1^2 + (p_2 - 1)^2 + (p_3 - 2)^2$$

and so

$$p_1^2 - 2p_1 + 14 + p_2^2 - 4p_2 + p_3^2 - 6p_3 = p_1^2 + p_2^2 - 2p_2 + 5 + p_3^2 - 4p_3$$

Simplifying, this becomes

$$-2p_1 + 14 - 4p_2 - 6p_3 = -2p_2 + 5 - 4p_3$$

which can be written as

$$2p_1 + 2p_2 + 2p_3 = 9 \qquad (4.5)$$

Therefore, the points $P = (p_1, p_2, p_3)$ which are the same distance from each of the given
points form a plane whose equation is given by 4.5. □
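
As a numerical check of this example, the sketch below recomputes the coefficients of the plane directly from the two given points and tests one point of that plane for equal distance. The function names are hypothetical. The formula used, $2(\vec{a} - \vec{b}) \cdot \vec{p} = \|\vec{a}\|^2 - \|\vec{b}\|^2$, comes from expanding the squared distances and cancelling the common terms, as in the solution.

```python
import math


def distance(p, q):
    """Distance between two points with the same number of coordinates."""
    return math.sqrt(sum((pj - qj) ** 2 for pj, qj in zip(p, q)))


def equidistant_plane(a, b):
    """Return (normal, c) so that normal . p = c describes the points p
    that are the same distance from a and from b."""
    normal = [2 * (aj - bj) for aj, bj in zip(a, b)]
    c = sum(aj * aj for aj in a) - sum(bj * bj for bj in b)
    return normal, c


a, b = (1, 2, 3), (0, 1, 2)
normal, c = equidistant_plane(a, b)
print(normal, c)

# A point chosen to satisfy normal . p = c; it should be equidistant from a and b.
p = (c / normal[0], 0.0, 0.0)
print(math.isclose(distance(p, a), distance(p, b)))
```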

We can now use our understanding of the distance between two points to define what is 
meant by the length of a vector. Consider the following definition. 



This definition corresponds to Definition 4.9, if you consider the vector $\vec{u}$ to have its tail
at the point $0 = (0, \cdots, 0)$ and its tip at the point $U = (u_1, \cdots, u_n)$. Then the length of $\vec{u}$
is equal to the distance between $0$ and $U$, $d(0, U)$. In general, $d(P, Q) = \|\overrightarrow{PQ}\|$.

Consider Example 4.10. By Definition 4.13, we could also find the distance between $P$
and $Q$ as the length of the vector connecting them. Hence, if we were to draw a vector $\overrightarrow{PQ}$
with its tail at $P$ and its point at $Q$, this vector would have length equal to $\sqrt{47}$.

We conclude this section with a new definition for the special case of vectors of length 1. 


Definition 4.14: Unit Vector



Let $\vec{v}$ be a nonzero vector in $\mathbb{R}^n$. Then, the vector $\vec{u}$ which has the same direction as $\vec{v}$ but length
equal to 1 is the corresponding unit vector of $\vec{v}$. This vector is given by

$$\vec{u} = \frac{1}{\|\vec{v}\|} \vec{v}$$


We often use the term normalize to refer to this process. When we normalize a vector, 
we find the corresponding unit vector of length 1. Consider the following example. 


Example 4.15: Finding a Unit Vector


Let $\vec{v}$ be given by

$$\vec{v} = [1 \; -3 \; 4]^T$$

Find the unit vector $\vec{u}$ which has the same direction as $\vec{v}$.

Solution. We will use Definition 4.14 to solve this. Therefore, we need to find the length of
$\vec{v}$ which, by Definition 4.13, is given by

$$\|\vec{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$

Using the corresponding values we find that

$$\begin{aligned}
\|\vec{v}\| &= \sqrt{1^2 + (-3)^2 + 4^2} \\
&= \sqrt{1 + 9 + 16} \\
&= \sqrt{26}
\end{aligned}$$

In order to find $\vec{u}$, we divide $\vec{v}$ by $\sqrt{26}$. The result is

$$\vec{u} = \frac{1}{\sqrt{26}} [1 \; -3 \; 4]^T = \left[ \frac{1}{\sqrt{26}} \;\; -\frac{3}{\sqrt{26}} \;\; \frac{4}{\sqrt{26}} \right]^T$$

You can verify using Definition 4.13 that $\|\vec{u}\| = 1$. □
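
The normalization step can also be checked numerically. The following Python sketch uses hypothetical helpers `length` and `normalize`. It divides each component by the length of the vector and refuses the zero vector, which has no direction.

```python
import math


def length(v):
    """Length of a vector: square root of the sum of squared components."""
    return math.sqrt(sum(vj * vj for vj in v))


def normalize(v):
    """Return the unit vector with the same direction as the nonzero vector v."""
    norm = length(v)
    if norm == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [vj / norm for vj in v]


u = normalize([1, -3, 4])
print(u)
print(math.isclose(length(u), 1.0))
```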


4.5 Geometric Meaning of Scalar Multiplication


Outcomes


A. Understand scalar multiplication, geometrically.


Recall that the point $P = (p_1, p_2, p_3)$ determines a vector $\vec{p}$ from $0$ to $P$. The length of
$\vec{p}$, denoted $\|\vec{p}\|$, is equal to $\sqrt{p_1^2 + p_2^2 + p_3^2}$ by Definition 4.9.

Now suppose we have a vector $\vec{u} = [u_1 \; u_2 \; u_3]^T$ and we multiply $\vec{u}$ by a scalar $k$. By
Definition 4.5, $k\vec{u} = [ku_1 \; ku_2 \; ku_3]^T$. Then, by using Definition 4.9, the length of this
vector is given by

$$\sqrt{(ku_1)^2 + (ku_2)^2 + (ku_3)^2} = |k| \sqrt{u_1^2 + u_2^2 + u_3^2}$$

Thus the following holds.

$$\|k\vec{u}\| = |k| \|\vec{u}\|$$

In other words, multiplication by a scalar magnifies or shrinks the length of the vector by a
factor of $|k|$. If $|k| > 1$, the length of the resulting vector will be magnified. If $|k| < 1$, the
length of the resulting vector will shrink. Remember that by the definition of the absolute
value, $|k| \geq 0$.

What about the direction? Draw a picture of u and ku where k is negative. Notice that 
this causes the resulting vector to point in the opposite direction while if k > 0 it preserves 
the direction the vector points. Therefore the direction can either reverse, if k < 0, or remain 
preserved, if k > 0. 
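
A quick way to see both effects at once is to scale a concrete vector and compare lengths. The sketch below is illustrative only: the helper names `length` and `scale` are hypothetical, and the vector $[3 \; 4 \; 12]^T$ is chosen simply because its length is a whole number.

```python
import math


def length(u):
    """Length of a vector: square root of the sum of squared components."""
    return math.sqrt(sum(uj * uj for uj in u))


def scale(k, u):
    """Multiply every component of u by the scalar k."""
    return [k * uj for uj in u]


u = [3, 4, 12]
for k in (2, -0.5):
    ku = scale(k, u)
    # The length is multiplied by |k|; a negative k also flips every sign.
    print(k, ku, length(ku), abs(k) * length(u))
```

For the negative scalar, every component changes sign, which is the algebraic counterpart of the vector pointing in the opposite direction.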

Consider the following example. 


Example 4.16: Graphing Scalar Multiplication

Consider the vectors $\vec{u}$ and $\vec{v}$ drawn below.


Draw $-\vec{u}$, $2\vec{v}$, and $-\frac{1}{2}\vec{v}$.



Solution.

In order to find $-\vec{u}$, we preserve the length of $\vec{u}$ and simply reverse the direction. For $2\vec{v}$,
we double the length of $\vec{v}$, while preserving the direction. Finally $-\frac{1}{2}\vec{v}$ is found by taking
half the length of $\vec{v}$ and reversing the direction. These vectors are shown in the following
diagram.



□ 

Now that we have studied both vector addition and scalar multiplication, we can combine
the two actions. Recall Definition 1.32 of linear combinations of column matrices. We can
apply this definition to vectors in $\mathbb{R}^n$. A linear combination of vectors in $\mathbb{R}^n$ is a sum of
vectors multiplied by scalars.

In the following example, we examine the geometric meaning of this concept. 
Example 4.17: Graphing a Linear Combination of Vectors


Consider the following picture of the vectors $\vec{u}$ and $\vec{v}$


Sketch a picture of $\vec{u} + 2\vec{v}$, $\vec{u} - \frac{1}{2}\vec{v}$.



Solution. The two vectors are shown below. 




□ 


4.6 Parametric Lines 


Outcomes 


A. Find the vector and parametric equations of a line. 


We can use the concept of vectors and points to find equations for arbitrary lines in $\mathbb{R}^n$,
although in this section the focus will be on lines in $\mathbb{R}^3$.

To begin, consider the case $n = 1$ so we have $\mathbb{R}^1 = \mathbb{R}$. There is only one line here which
is the familiar number line, that is $\mathbb{R}$ itself. Therefore it is not necessary to explore the case
of $n = 1$ further.
Now consider the case where $n = 2$, in other words $\mathbb{R}^2$. Let $P$ and $P_0$ be two different
points in $\mathbb{R}^2$ which are contained in a line $L$. Let $\vec{p}$ and $\vec{p_0}$ be the position vectors for the
points $P$ and $P_0$ respectively. Suppose that $Q$ is an arbitrary point on $L$. Consider the
following diagram.



Our goal is to be able to define $Q$ in terms of $P$ and $P_0$. Consider the vector $\overrightarrow{P_0P} = \vec{p} - \vec{p_0}$
which has its tail at $P_0$ and point at $P$. If we add $\vec{p} - \vec{p_0}$ to the position vector $\vec{p_0}$ for $P_0$, the
sum would be a vector with its point at $P$. In other words,

$$\vec{p} = \vec{p_0} + (\vec{p} - \vec{p_0})$$

Now suppose we were to add $t(\vec{p} - \vec{p_0})$ to $\vec{p_0}$ where $t$ is some scalar. You can see that by
doing so, we could find a vector with its point at $Q$. In other words, we can find $t$ such that

$$\vec{q} = \vec{p_0} + t(\vec{p} - \vec{p_0})$$

This equation determines the line $L$ in $\mathbb{R}^2$. In fact, it determines a line $L$ in $\mathbb{R}^n$. Consider
the following definition.


Definition 4.18: Vector Equation of a Line


Suppose a line $L$ in $\mathbb{R}^n$ contains the two different points $P$ and $P_0$. Let $\vec{p}$ and $\vec{p_0}$ be the
position vectors of these two points, respectively. Then, $L$ is the collection of points
$Q$ which have the position vector $\vec{q}$ given by

$$\vec{q} = \vec{p_0} + t(\vec{p} - \vec{p_0})$$

where $t \in \mathbb{R}$.

Let $\vec{d} = \vec{p} - \vec{p_0}$. Then $\vec{d}$ is the direction vector for $L$ and the vector equation for
$L$ is given by

$$\vec{p} = \vec{p_0} + t\vec{d}, \; t \in \mathbb{R}$$


Note that this definition agrees with the usual notion of a line in two dimensions and so
this is consistent with earlier concepts. Consider now points in $\mathbb{R}^3$. If a point $P \in \mathbb{R}^3$ is
given by $P = (x, y, z)$, $P_0 \in \mathbb{R}^3$ by $P_0 = (x_0, y_0, z_0)$, then we can write

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} + t \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

where $\vec{d} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$. This is the vector equation of $L$ written in component form.


The following theorem claims that such an equation is in fact a line. 



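Proof. Let $\vec{x_1}, \vec{x_2} \in \mathbb{R}^n$. Define $\vec{x_1} = \vec{a}$ and let $\vec{x_2} - \vec{x_1} = \vec{b}$. Since $\vec{b} \neq \vec{0}$, it follows that
$\vec{x_2} \neq \vec{x_1}$. Then $\vec{a} + t\vec{b} = \vec{x_1} + t(\vec{x_2} - \vec{x_1})$. It follows that $\vec{x} = \vec{a} + t\vec{b}$ is a line containing the
two different points $X_1$ and $X_2$ whose position vectors are given by $\vec{x_1}$ and $\vec{x_2}$ respectively. □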

We can use the above discussion to find the equation of a line when given two distinct 
points. Consider the following example. 


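Example 4.20: A Line From Two Points


Find a vector equation for the line through the points $P_0 = (1, 2, 0)$ and $P = (2, -4, 6)$.


Solution. We will use the definition of a line given above in Definition 4.18 to write this line
in the form

$$\vec{q} = \vec{p_0} + t(\vec{p} - \vec{p_0})$$

Let $\vec{q} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$. Then, we can find $\vec{p}$ and $\vec{p_0}$ by taking the position vectors of points $P$ and
$P_0$ respectively. Then,

$$\vec{q} = \vec{p_0} + t(\vec{p} - \vec{p_0})$$

can be written as

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + t \begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}, \; t \in \mathbb{R}$$

Here, the direction vector $\begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}$ is obtained by $\vec{p} - \vec{p_0} = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$ as indicated
above in Definition 4.18.

□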
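
The construction in this example, subtracting position vectors to get a direction and then adding multiples of it to the starting point, can be sketched in Python. The names `line_through` and `point_at` are hypothetical helpers introduced for this illustration.

```python
def line_through(p0, p):
    """Return (p0, d) describing the line q = p0 + t*d through two different points."""
    d = [pj - p0j for pj, p0j in zip(p, p0)]
    if all(dj == 0 for dj in d):
        raise ValueError("the two points must be different")
    return list(p0), d


def point_at(p0, d, t):
    """Evaluate q = p0 + t*d for a given scalar t."""
    return [p0j + t * dj for p0j, dj in zip(p0, d)]


p0, d = line_through([1, 2, 0], [2, -4, 6])
print(d)                   # direction vector p - p0
print(point_at(p0, d, 0))  # t = 0 gives the point P0
print(point_at(p0, d, 1))  # t = 1 gives the point P
```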


Notice that in the above example we said that we found “a” vector equation for the
line, not “the” equation. The reason for this terminology is that there are infinitely many
different vector equations for the same line. To see this, replace $t$ with another parameter,
say $3s$. Then you obtain a different vector equation for the same line because the same set
of points is obtained.


In Example 4.20, the vector given by $\begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}$ is the direction vector defined in Definition
4.18. If we know the direction vector of a line, as well as a point on the line, we can find the
vector equation.
Consider the following example.


Example 4.21: A Line From a Point and a Direction Vector

Find a vector equation for the line which contains the point $P_0 = (1, 2, 0)$ and has
direction vector $\vec{d} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$.


Solution. We will use Definition 4.18 to write this line in the form $\vec{p} = \vec{p_0} + t\vec{d}$, $t \in \mathbb{R}$. We
are given the direction vector $\vec{d}$. In order to find $\vec{p_0}$, we can use the position vector of the
point $P_0$. This is given by $\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$. Letting $\vec{p} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$, the equation for the line is given by

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + t \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \; t \in \mathbb{R} \qquad (4.6)$$

□

We sometimes elect to write a line such as the one given in 4.6 in the form

$$\left. \begin{array}{l} x = 1 + t \\ y = 2 + 2t \\ z = t \end{array} \right\} \text{ where } t \in \mathbb{R} \qquad (4.7)$$

This set of equations gives the same information as 4.6, and is called the parametric equation
of the line.

Consider the following definition. 



You can verify that the form discussed following Example 4.21 in equation 4.7 is of the 
form given in Definition 4.22. 
There is one other form for a line which is useful, which is the symmetric form. Consider
the line given by 4.7. You can solve for the parameter $t$ to write

$$\begin{array}{l} t = x - 1 \\ t = \dfrac{y - 2}{2} \\ t = z \end{array}$$

Therefore,

$$x - 1 = \frac{y - 2}{2} = z$$

This is the symmetric form of the line.

In the following example, we look at how to take the equation of a line from symmetric 
form to parametric form. 



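Solution. We want to write this line in the form given by Definition 4.22. This is of the form

$$\left. \begin{array}{l} x = x_0 + ta \\ y = y_0 + tb \\ z = z_0 + tc \end{array} \right\} \text{ where } t \in \mathbb{R}$$

Let $t = \frac{x - 2}{3}$, $t = \frac{y - 1}{2}$ and $t = z + 3$, as given in the symmetric form of the line. Then
solving for $x, y, z$ yields

$$\left. \begin{array}{l} x = 2 + 3t \\ y = 1 + 2t \\ z = -3 + t \end{array} \right\} \text{ with } t \in \mathbb{R}$$

This is the parametric equation for this line.

Now, we want to write this line in the form given by Definition 4.18. This is the form

$$\vec{p} = \vec{p_0} + t\vec{d}$$

where $t \in \mathbb{R}$. This equation becomes

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} + t \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}, \; t \in \mathbb{R}$$

□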
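
One way to sanity-check a conversion like this is to generate points from the parametric equations $x = 2 + 3t$, $y = 1 + 2t$, $z = -3 + t$ and confirm that solving each equation for $t$ returns the same value. The sketch below does this with exact fractions so that no rounding is involved. The function name `parametric_point` and the sampled values of $t$ are illustrative choices.

```python
from fractions import Fraction


def parametric_point(t):
    """Point on the line x = 2 + 3t, y = 1 + 2t, z = -3 + t."""
    return 2 + 3 * t, 1 + 2 * t, -3 + t


for t in (Fraction(-2), Fraction(0), Fraction(1, 3), Fraction(5)):
    x, y, z = parametric_point(t)
    # Each ratio recovers the parameter, so the three expressions agree.
    assert (x - 2) / 3 == (y - 1) / 2 == z + 3 == t

print("all sampled points satisfy (x - 2)/3 = (y - 1)/2 = z + 3")
```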
4.6.1. Exercises


1. Find the vector equation for the line through $(-7, 6, 0)$ and $(-1, 1, 4)$. Then, find the
parametric equations for this line.

2. Find parametric equations for the line through the point (7, 7, 1) with a direction vector 



3. Parametric equations of the line are

$$\begin{array}{l} x = t + 2 \\ y = 6 - 3t \\ z = -t - 6 \end{array}$$

Find a direction vector for the line and a point on the line.

4. Find the vector equation for the line through the two points $(-5, 5, 1)$, $(2, 2, 4)$. Then,
find the parametric equations.

5. The equation of a line in two dimensions is written as $y = x - 5$. Find parametric
equations for this line.

6. Find parametric equations for the line through $(6, 5, -2)$ and $(5, 1, 2)$.

7. Find the vector equation and parametric equations for the line through the point
$(-7, 10, -6)$ with a direction vector $\vec{d} = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}$.

8. Parametric equations of the line are

$$\begin{array}{l} x = 2t + 2 \\ y = 5 - 4t \\ z = -t - 3 \end{array}$$

Find a direction vector for the line and a point on the line, and write the vector
equation of the line.

9. Find the vector equation and parametric equations for the line through the two points
$(4, 10, 0)$, $(1, -5, -6)$.

10. Find the point on the line segment from P = (—4, 7, 5) to Q — (2, —2, —3) which is l 
of the way from P to Q. 

11. Suppose a triangle in $\mathbb{R}^n$ has vertices at $P_1$, $P_2$, and $P_3$. Consider the lines which are
drawn from a vertex to the mid point of the opposite side. Show these three lines
intersect in a point and find the coordinates of this point.
4.7 The Dot Product


Outcomes 


A. Compute the dot product of vectors, and use this to compute vector projections. 


4.7.1. The Dot Product 


There are two ways of multiplying vectors which are of great importance in applications. 
The first of these is called the dot product. When we take the dot product of vectors, the 
result is a scalar. For this reason, the dot product is also called the scalar product and 
sometimes the inner product. The definition is as follows. 



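The dot product $\vec{u} \cdot \vec{v}$ is sometimes denoted as $(\vec{u}, \vec{v})$ where a comma replaces $\cdot$. It can
also be written as $\langle \vec{u}, \vec{v} \rangle$. If we write the vectors as column matrices, it is equal to
the matrix product $\vec{u}^T \vec{v}$; for row matrices, it is equal to $\vec{u} \vec{v}^T$.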

Consider the following example. 


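Example 4.25: Compute a Dot Product


Find $\vec{u} \cdot \vec{v}$ for

$$\vec{u} = \begin{bmatrix} 1 \\ 2 \\ 0 \\ -1 \end{bmatrix}, \vec{v} = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 3 \end{bmatrix}$$


Solution. By Definition 4.24, we must compute

$$\vec{u} \cdot \vec{v} = \sum_{k=1}^{4} u_k v_k$$

This is given by

$$\begin{aligned}
\vec{u} \cdot \vec{v} &= (1)(0) + (2)(1) + (0)(2) + (-1)(3) \\
&= 0 + 2 + 0 + (-3) \\
&= -1
\end{aligned}$$

□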
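
The sum in this example translates directly into code. The following Python sketch, with a hypothetical function name `dot`, multiplies corresponding components and adds the results.

```python
def dot(u, v):
    """Dot product: the sum of products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(uk * vk for uk, vk in zip(u, v))


print(dot([1, 2, 0, -1], [0, 1, 2, 3]))  # -1
```

Note that the result is a single number, not a vector.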

With this definition, there are several important properties satisfied by the dot product. 



The proof of this proposition is left as an exercise. 

This proposition tells us that we can also use the dot product to find the length of a 
vector. 



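Solution. By Proposition 4.26, $\|\vec{u}\|^2 = \vec{u} \cdot \vec{u}$. Therefore, $\|\vec{u}\| = \sqrt{\vec{u} \cdot \vec{u}}$. First, compute $\vec{u} \cdot \vec{u}$.
This is given by

$$\begin{aligned}
\vec{u} \cdot \vec{u} &= (2)(2) + (1)(1) + (4)(4) + (2)(2) \\
&= 4 + 1 + 16 + 4 \\
&= 25
\end{aligned}$$

Then,

$$\|\vec{u}\| = \sqrt{\vec{u} \cdot \vec{u}} = \sqrt{25} = 5$$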


□ 

You may wish to compare this to our previous definition of length, given in Definition 
4.13. 

The Cauchy Schwarz inequality is a fundamental inequality satisfied by the dot product.
It is given in the following theorem.



Proof. First note that if $\vec{v} = \vec{0}$ both sides of 4.8 equal zero and so the inequality holds in
this case. Therefore, it will be assumed in what follows that $\vec{v} \neq \vec{0}$.

Define a function of $t \in \mathbb{R}$ by

$$f(t) = (\vec{u} + t\vec{v}) \cdot (\vec{u} + t\vec{v})$$

Then by Proposition 4.26, $f(t) \geq 0$ for all $t \in \mathbb{R}$. Also from Proposition 4.26

$$\begin{aligned}
f(t) &= \vec{u} \cdot (\vec{u} + t\vec{v}) + t\vec{v} \cdot (\vec{u} + t\vec{v}) \\
&= \vec{u} \cdot \vec{u} + t(\vec{u} \cdot \vec{v}) + t\vec{v} \cdot \vec{u} + t^2 \vec{v} \cdot \vec{v} \\
&= \|\vec{u}\|^2 + 2t(\vec{u} \cdot \vec{v}) + \|\vec{v}\|^2 t^2
\end{aligned}$$

Now this means the graph of $y = f(t)$ is a parabola which opens up and either its vertex
touches the $t$ axis or else the entire graph is above the $t$ axis. In the first case, there exists
some $t$ where $f(t) = 0$ and this requires $\vec{u} + t\vec{v} = \vec{0}$ so one vector is a multiple of the other.
Then clearly equality holds in 4.8. In the case where $\vec{u}$ is not a multiple of $\vec{v}$, it follows
$f(t) > 0$ for all $t$ which says $f(t)$ has no real zeros and so from the quadratic formula,

$$(2(\vec{u} \cdot \vec{v}))^2 - 4\|\vec{u}\|^2 \|\vec{v}\|^2 < 0$$

which is equivalent to $|\vec{u} \cdot \vec{v}| < \|\vec{u}\| \|\vec{v}\|$. □
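
The two cases in this proof can be observed numerically. The sketch below computes the gap $\|\vec{u}\| \|\vec{v}\| - |\vec{u} \cdot \vec{v}|$, which the inequality says is never negative. The function names are hypothetical and the sample vectors are arbitrary choices; because floating point square roots are involved, the equality case is tested with a small tolerance rather than exactly.

```python
import math


def dot(u, v):
    """Dot product: the sum of products of corresponding components."""
    return sum(uk * vk for uk, vk in zip(u, v))


def length(u):
    """Length of a vector computed from the dot product."""
    return math.sqrt(dot(u, u))


def cauchy_schwarz_gap(u, v):
    """Return ||u|| ||v|| - |u . v|."""
    return length(u) * length(v) - abs(dot(u, v))


# Neither vector is a multiple of the other: strict inequality.
print(cauchy_schwarz_gap([1, 2, 0, -1], [0, 1, 2, 3]) > 0)

# The second vector is -2 times the first: equality, up to rounding.
print(math.isclose(cauchy_schwarz_gap([1, -3, 4], [-2, 6, -8]), 0.0, abs_tol=1e-9))
```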

Notice that this proof was based only on the properties of the dot product listed in
Proposition 4.26. This means that whenever an operation satisfies these properties, the
Cauchy Schwarz inequality holds. There are many other instances of these properties besides
vectors in $\mathbb{R}^n$.

The Cauchy Schwarz inequality provides another proof of the triangle inequality for
distances in $\mathbb{R}^n$.


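Theorem 4.29: Triangle Inequality

For $\vec{u}, \vec{v} \in \mathbb{R}^n$

$$\|\vec{u} + \vec{v}\| \leq \|\vec{u}\| + \|\vec{v}\| \qquad (4.9)$$

and equality holds if and only if one of the vectors is a non-negative scalar multiple of
the other.

Also

$$\left| \|\vec{u}\| - \|\vec{v}\| \right| \leq \|\vec{u} - \vec{v}\| \qquad (4.10)$$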


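Proof. By properties of the dot product and the Cauchy Schwarz inequality,

$$\begin{aligned}
\|\vec{u} + \vec{v}\|^2 &= (\vec{u} + \vec{v}) \cdot (\vec{u} + \vec{v}) \\
&= (\vec{u} \cdot \vec{u}) + (\vec{u} \cdot \vec{v}) + (\vec{v} \cdot \vec{u}) + (\vec{v} \cdot \vec{v}) \\
&= \|\vec{u}\|^2 + 2(\vec{u} \cdot \vec{v}) + \|\vec{v}\|^2 \\
&\leq \|\vec{u}\|^2 + 2|\vec{u} \cdot \vec{v}| + \|\vec{v}\|^2 \\
&\leq \|\vec{u}\|^2 + 2\|\vec{u}\| \|\vec{v}\| + \|\vec{v}\|^2 = (\|\vec{u}\| + \|\vec{v}\|)^2
\end{aligned}$$

Hence,

$$\|\vec{u} + \vec{v}\|^2 \leq (\|\vec{u}\| + \|\vec{v}\|)^2$$

Taking square roots of both sides you obtain 4.9.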

It remains to consider when equality occurs. Suppose $\vec{u} = \vec{0}$. Then, $\vec{u} = 0\vec{v}$ and the claim
about when equality occurs is verified. The same argument holds if $\vec{v} = \vec{0}$. Therefore, it can
be assumed both vectors are nonzero. To get equality in 4.9 above, Theorem 4.28 implies
one of the vectors must be a multiple of the other. Say $\vec{v} = k\vec{u}$. If $k < 0$ then equality cannot
occur in 4.9 because in this case

$$\vec{u} \cdot \vec{v} = k\|\vec{u}\|^2 < 0 < |k| \|\vec{u}\|^2 = |\vec{u} \cdot \vec{v}|$$

Therefore, $k \geq 0$.

To get the other form of the triangle inequality write

$$\vec{u} = \vec{u} - \vec{v} + \vec{v}$$

so

$$\begin{aligned}
\|\vec{u}\| &= \|\vec{u} - \vec{v} + \vec{v}\| \\
&\leq \|\vec{u} - \vec{v}\| + \|\vec{v}\|
\end{aligned}$$

Therefore,

$$\|\vec{u}\| - \|\vec{v}\| \leq \|\vec{u} - \vec{v}\| \qquad (4.11)$$

Similarly,

$$\|\vec{v}\| - \|\vec{u}\| \leq \|\vec{v} - \vec{u}\| = \|\vec{u} - \vec{v}\| \qquad (4.12)$$

It follows from 4.11 and 4.12 that 4.10 holds. This is because $\left| \|\vec{u}\| - \|\vec{v}\| \right|$ equals the left side
of either 4.11 or 4.12 and either way, $\left| \|\vec{u}\| - \|\vec{v}\| \right| \leq \|\vec{u} - \vec{v}\|$. □


4.7.2. The Geometric Significance of the Dot Product 


Given two vectors, $\vec{u}$ and $\vec{v}$, the included angle is the angle between these two vectors which
is less than or equal to 180 degrees. The dot product can be used to determine the included
angle between two vectors. Consider the following picture where $\theta$ gives the included angle.




In words, the dot product of two vectors equals the product of the magnitude (or length) 
of the two vectors multiplied by the cosine of the included angle. Note this gives a geometric 
description of the dot product which does not depend explicitly on the coordinates of the 
vectors. 

Consider the following example. 


Example 4.31: Find the Angle Between Two Vectors


Find the angle between the vectors given by

$$\vec{u} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \vec{v} = \begin{bmatrix} 3 \\ 4 \\ 1 \end{bmatrix}$$


Solution. By Proposition 4.30,

$$\vec{u} \cdot \vec{v} = \|\vec{u}\| \|\vec{v}\| \cos\theta$$

Hence,

$$\cos\theta = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\| \|\vec{v}\|}$$

First, we can compute $\vec{u} \cdot \vec{v}$. By Definition 4.24, this equals

$$\vec{u} \cdot \vec{v} = (2)(3) + (1)(4) + (-1)(1) = 9$$

Then,

$$\|\vec{u}\| = \sqrt{(2)(2) + (1)(1) + (-1)(-1)} = \sqrt{6}$$
$$\|\vec{v}\| = \sqrt{(3)(3) + (4)(4) + (1)(1)} = \sqrt{26}$$

Therefore, the cosine of the included angle equals

$$\cos\theta = \frac{9}{\sqrt{26}\sqrt{6}} = 0.7205766...$$

With the cosine known, the angle can be determined by computing the inverse cosine of
that value, giving approximately $\theta = 0.76616$ radians. □
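
The same calculation can be sketched in Python with the standard `math` module. The helper names `dot`, `length` and `included_angle` are hypothetical. The cosine is clamped to the interval from -1 to 1 before calling `math.acos`, a precaution against rounding pushing the ratio slightly outside that range.

```python
import math


def dot(u, v):
    """Dot product: the sum of products of corresponding components."""
    return sum(uk * vk for uk, vk in zip(u, v))


def length(u):
    """Length of a vector computed from the dot product."""
    return math.sqrt(dot(u, u))


def included_angle(u, v):
    """Included angle, in radians, between two nonzero vectors."""
    cosine = dot(u, v) / (length(u) * length(v))
    return math.acos(max(-1.0, min(1.0, cosine)))


theta = included_angle([2, 1, -1], [3, 4, 1])
print(theta)
```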


Another application of the geometric description of the dot product is in finding the angle 
between two lines. Typically one would assume that the lines intersect. In some situations, 
however, it may make sense to ask this question when the lines do not intersect, such as the 
angle between two object trajectories. In any case we understand it to mean the smallest 
angle between (any of) their direction vectors. The only subtlety here is that if u is a direction
vector for a line, then so is any nonzero multiple ku, and thus we will find supplementary angles
among all angles between direction vectors for two lines, and we simply take the smaller of
the two.


Example 4.32: Find the Angle Between Two Lines 


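Find the angle between the two lines

\[ L_1: \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + t \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} \]

and

\[ L_2: \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 4 \\ -3 \end{bmatrix} + s \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} \]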

Solution. You can verify that these lines do not intersect, but as discussed above this does
not matter and we simply find the smallest angle between any direction vectors for these
lines.

To do so we first find the angle between the direction vectors given above:

\[ u = \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}, \quad v = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} \]

In order to find the angle, we solve the following equation for $\theta$

\[ u \cdot v = \|u\| \|v\| \cos\theta \]

Here $u \cdot v = -3$ and $\|u\| = \|v\| = \sqrt{6}$, so we obtain $\cos\theta = -\frac{1}{2}$, and since we choose
included angles between 0 and $\pi$ we obtain $\theta = \frac{2\pi}{3}$.

Now the angles between any two direction vectors for these lines will either be $\frac{2\pi}{3}$ or its
supplement $\phi = \pi - \frac{2\pi}{3} = \frac{\pi}{3}$. We choose the smaller angle, and therefore conclude that the
angle between the two lines is $\frac{\pi}{3}$. □
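
The rule of taking the smaller of the two candidate angles can be sketched in code. In this illustrative Python snippet (the function name `angle_between_lines` is an assumption made here, not from the text), taking the absolute value of the dot product of the direction vectors selects the smaller angle directly.

```python
import math

def angle_between_lines(d1, d2):
    """Smallest angle (radians) between two lines with direction vectors d1, d2."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    # |d1 . d2| makes the cosine non-negative, so the result lies in [0, pi/2].
    c = abs(dot) / (n1 * n2)
    return math.acos(min(1.0, c))

# Direction vectors of L1 and L2 from Example 4.32.
line_angle = angle_between_lines([-1, 1, 2], [2, 1, -1])
print(line_angle, math.pi / 3)
```

Replacing either direction vector by a nonzero multiple leaves the result unchanged, which is the point of the discussion above.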

We can also use Proposition 4.30 to compute the dot product of two vectors. 



Solution. From the geometric description of the dot product in Proposition 4.30

\[ u \cdot v = (3)(4) \cos(\pi/3) = 3 \times 4 \times 1/2 = 6 \]


□ 

Two nonzero vectors are said to be perpendicular, sometimes also called orthogonal,
if the included angle is $\pi/2$ radians (90°).

Consider the following proposition. 



Proof. This follows directly from Proposition 4.30. First if the dot product of two nonzero
vectors is equal to 0, this tells us that $\cos\theta = 0$ (this is where we need nonzero vectors).
Thus $\theta = \pi/2$ and the vectors are perpendicular.

If on the other hand v is perpendicular to u, then the included angle is $\pi/2$ radians.
Hence $\cos\theta = 0$ and $u \cdot v = 0$. □

Consider the following example. 


Example 4.35: Determine if Two Vectors are Perpendicular 


Determine whether the two vectors,

\[ u = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} \]

are perpendicular.





Solution. In order to determine if these two vectors are perpendicular, we compute the dot 
product. This is given by 


\[ u \cdot v = (2)(1) + (1)(3) + (-1)(5) = 0 \]


Therefore, by Proposition 4.34 these two vectors are perpendicular. 


□ 


4.7.3. Projections 


In some applications, we wish to write a vector as a sum of two related vectors. Through 
the concept of projections, we can find these two vectors. First, we explore an important 
theorem. The result of this theorem will provide our definition of a vector projection. 

Theorem 4.36

Let v and u be nonzero vectors. Then there exist unique vectors $v_{\parallel}$ and $v_{\perp}$ such that

\[ v = v_{\parallel} + v_{\perp} \tag{4.13} \]

where $v_{\parallel}$ is a scalar multiple of u, and $v_{\perp}$ is perpendicular to u.



Proof. Suppose 4.13 holds and $v_{\parallel} = ku$. Taking the dot product of both sides of 4.13 with
u and using $v_{\perp} \cdot u = 0$, this yields

\[ v \cdot u = (v_{\parallel} + v_{\perp}) \cdot u = ku \cdot u + v_{\perp} \cdot u = k\|u\|^2 \]

which requires $k = v \cdot u / \|u\|^2$. Thus there can be no more than one vector $v_{\parallel}$. It follows $v_{\perp}$
must equal $v - v_{\parallel}$. This verifies there can be no more than one choice for both $v_{\parallel}$ and $v_{\perp}$
and proves their uniqueness.

Now let

\[ v_{\parallel} = \frac{v \cdot u}{\|u\|^2} u \]

and let

\[ v_{\perp} = v - v_{\parallel} = v - \frac{v \cdot u}{\|u\|^2} u \]

Then $v_{\parallel} = ku$ where $k = \frac{v \cdot u}{\|u\|^2}$. It only remains to verify $v_{\perp} \cdot u = 0$. But

\[ v_{\perp} \cdot u = v \cdot u - \frac{v \cdot u}{\|u\|^2} u \cdot u = v \cdot u - v \cdot u = 0 \]

□
The vector $v_{\parallel}$ in Theorem 4.36 is called the projection of v onto u and is denoted by

\[ v_{\parallel} = \operatorname{proj}_u(v) \]

We now make a formal definition of the vector projection. 

Definition 4.37

Let u and v be vectors with u nonzero. Then the projection of v onto u is given by

\[ \operatorname{proj}_u(v) = \frac{v \cdot u}{\|u\|^2} u \]



Consider the following example of a projection. 


Example 4.38: Find the Projection of One Vector Onto Another 


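Find $\operatorname{proj}_u(v)$ if

\[ u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \]

Solution. We can use the formula provided in Definition 4.37 to find $\operatorname{proj}_u(v)$. First, compute
v • u. This is given by

\[ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} = (2)(1) + (3)(-2) + (-4)(1) = 2 - 6 - 4 = -8 \]

Similarly, u • u is given by

\[ \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} = (2)(2) + (3)(3) + (-4)(-4) = 4 + 9 + 16 = 29 \]

Therefore, the projection is equal to

\[ \operatorname{proj}_u(v) = -\frac{8}{29} \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} = \begin{bmatrix} -\frac{16}{29} \\ -\frac{24}{29} \\ \frac{32}{29} \end{bmatrix} \]

□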
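
As a check on the projection formula $\frac{v \cdot u}{\|u\|^2} u$, here is a short Python sketch using exact fractions. It is an illustration only; the function name `proj` and the variable names are assumptions made here. It also confirms that the remainder v minus the projection is perpendicular to u.

```python
from fractions import Fraction

def proj(u, v):
    """Projection of v onto u, (v.u / u.u) u, in exact arithmetic."""
    vu = sum(Fraction(a) * b for a, b in zip(v, u))
    uu = sum(Fraction(a) * a for a in u)
    if uu == 0:
        raise ValueError("cannot project onto the zero vector")
    k = vu / uu
    return [k * a for a in u]

u = [2, 3, -4]
v = [1, -2, 1]
p = proj(u, v)                          # the component of v parallel to u
perp = [a - b for a, b in zip(v, p)]    # the component of v perpendicular to u
print(p, sum(a * b for a, b in zip(perp, u)))
```

Using `Fraction` keeps the entries as exact ratios such as -16/29 rather than rounded decimals.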


We will conclude this section with an important application of projections. Suppose a 
line L and a point P are given such that P is not contained in L. Through the use of 
projections, we can determine the shortest distance from P to L. 



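Solution. In order to determine the shortest distance from P to L, we will first find the
vector $\overrightarrow{P_0P}$ and then find the projection of this vector onto L. The vector $\overrightarrow{P_0P}$ is given by

\[ \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} - \begin{bmatrix} 0 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 7 \end{bmatrix} \]

Then, if Q is the point on L closest to P, it follows that

\[ \overrightarrow{P_0Q} = \operatorname{proj}_d \overrightarrow{P_0P} = \frac{\overrightarrow{P_0P} \cdot d}{\|d\|^2} d = \frac{15}{9} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{5}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} \]

Now, the distance from P to L is given by

\[ \|\overrightarrow{QP}\| = \|\overrightarrow{P_0P} - \overrightarrow{P_0Q}\| = \sqrt{26} \]

The point Q is found by adding the vector $\overrightarrow{P_0Q}$ to the position vector $\overrightarrow{0P_0}$ for $P_0$ as
follows

\[ \begin{bmatrix} 0 \\ 4 \\ -2 \end{bmatrix} + \frac{5}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{10}{3} \\ \frac{17}{3} \\ \frac{4}{3} \end{bmatrix} \]

Therefore, $Q = \left(\frac{10}{3}, \frac{17}{3}, \frac{4}{3}\right)$.

□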
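
The procedure in this solution (form the vector from $P_0$ to P, project it onto the direction vector, then add the projection to $P_0$) can be sketched as follows. This is a hedged illustration, not the text's own method statement; the function name `closest_point_on_line` is an assumption, and the data P = (1, 3, 5), $P_0$ = (0, 4, -2), d = (2, 1, 2) are read off from the computation above.

```python
import math
from fractions import Fraction

def closest_point_on_line(P, P0, d):
    """Return (Q, distance): Q is the point on the line P0 + t d closest to P."""
    w = [Fraction(p) - p0 for p, p0 in zip(P, P0)]             # vector from P0 to P
    k = sum(a * b for a, b in zip(w, d)) / sum(Fraction(a) * a for a in d)
    Q = [p0 + k * a for p0, a in zip(P0, d)]                   # P0 + proj_d(P0P)
    dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(P, Q)))  # length of QP
    return Q, dist

Q, dist = closest_point_on_line((1, 3, 5), (0, 4, -2), (2, 1, 2))
print(Q, dist)
```

The point Q is kept exact, while the distance is returned as a floating-point approximation of $\sqrt{26}$.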


4.7.4. Exercises 


1. Find $\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 0 \\ 1 \\ 3 \end{bmatrix}$.

2. Use the formula given in Proposition 4.30 to verify the Cauchy Schwarz inequality and
to show that equality occurs if and only if one of the vectors is a scalar multiple of the
other.

3. For u, v vectors in $\mathbb{R}^3$, define the product, $u * v = u_1v_1 + 2u_2v_2 + 3u_3v_3$. Show the
axioms for a dot product all hold for this product. Prove

\[ |u * v| \le (u * u)^{1/2} (v * v)^{1/2} \]

4. Let a, b be vectors. Show that $(a \cdot b) = \frac{1}{4}\left(\|a + b\|^2 - \|a - b\|^2\right)$.

5. Using the axioms of the dot product, prove the parallelogram identity:

\[ \|a + b\|^2 + \|a - b\|^2 = 2\|a\|^2 + 2\|b\|^2 \]

6. Let A be a real m × n matrix and let $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^m$. Show $Au \cdot v = u \cdot A^T v$.
Hint: Use the definition of matrix multiplication to do this.

7. Use the result of Problem 6 to verify directly that $(AB)^T = B^T A^T$ without making
any reference to subscripts.

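8. Find the angle between the vectors

\[ u = \begin{bmatrix} 3 \\ -1 \\ -1 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix} \]

9. Find the angle between the vectors

\[ u = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 2 \\ -7 \end{bmatrix} \]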



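10. Find $\operatorname{proj}_v(w)$ where $w = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.

11. Find $\operatorname{proj}_v(w)$ where $w = \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$.

12. Find $\operatorname{proj}_v(w)$ where $w = \begin{bmatrix} 1 \\ 2 \\ -2 \\ 1 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix}$.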


13. Let P = (1, 2, 3) be a point in $\mathbb{R}^3$. Let L be the line through the point $P_0$ = (1, 4, 5)
with direction vector $d = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$. Find the shortest distance from P to L, and find
the point Q on L that is closest to P.

14. Let P = (0, 2, 1) be a point in $\mathbb{R}^3$. Let L be the line through the point $P_0$ = (1, 1, 1)
with direction vector $d = \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}$. Find the shortest distance from P to L, and find the
point Q on L that is closest to P.

15. Does it make sense to speak of $\operatorname{proj}_{\vec{0}}(w)$?

16. Prove the Cauchy Schwarz inequality in $\mathbb{R}^n$ as follows. For w, v vectors, consider

\[ (w - \operatorname{proj}_v w) \cdot (w - \operatorname{proj}_v w) \ge 0 \]

Simplify using the axioms of the dot product and then put in the formula for the
projection. Notice that this expression equals 0 and you get equality in the Cauchy
Schwarz inequality if and only if $w = \operatorname{proj}_v w$. What is the geometric meaning of
$w = \operatorname{proj}_v w$?

17. Let v, w, u be vectors. Show that $(w + u)_{\perp} = w_{\perp} + u_{\perp}$ where $w_{\perp} = w - \operatorname{proj}_v(w)$.

18. Show that

\[ (v - \operatorname{proj}_u(v), u) = (v - \operatorname{proj}_u(v)) \cdot u = 0 \]

and conclude every vector in $\mathbb{R}^n$ can be written as the sum of two vectors, one which
is perpendicular and one which is parallel to the given vector.
4.8 Planes in $\mathbb{R}^n$


Outcomes 


A. Find the vector and scalar equations of a plane. 


Much like the above discussion with lines, vectors can be used to determine planes in $\mathbb{R}^n$.
Given a vector n in $\mathbb{R}^n$ and a point $P_0$, it is possible to find a unique plane which contains
$P_0$ and is perpendicular to the given vector.



In other words, we say that n is orthogonal (perpendicular) to every vector in the plane.

Consider now a plane with normal vector given by n, and containing a point $P_0$. Notice
that this plane is unique. If P is an arbitrary point on this plane, then by definition the
normal vector is orthogonal to the vector between $P_0$ and P. Letting $\overrightarrow{0P}$ and $\overrightarrow{0P_0}$ be the
position vectors of points P and $P_0$ respectively, it follows that

\[ n \cdot (\overrightarrow{0P} - \overrightarrow{0P_0}) = 0 \]

or

\[ n \cdot \overrightarrow{P_0P} = 0 \]

The first of these equations gives the vector equation of the plane.



Notice that this equation can be used to determine if a point P is contained in a certain 
plane. 
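Example 4.42: A Point in a Plane


Let $n = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ be the normal vector for a plane which contains the point $P_0$ = (2, 1, 4).
Determine if the point P = (5, 4, 1) is contained in this plane.


Solution. By Definition 4.41, P is a point in the plane if it satisfies the equation

\[ n \cdot (\overrightarrow{0P} - \overrightarrow{0P_0}) = 0 \]

Given the above n, $P_0$, and P, this equation becomes

\[ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \left( \begin{bmatrix} 5 \\ 4 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 3 \\ -3 \end{bmatrix} = 3 + 6 - 9 = 0 \]

Therefore P = (5, 4, 1) is contained in the plane. □


Suppose $n = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$, $P = (x, y, z)$ and $P_0 = (x_0, y_0, z_0)$.
Then

\[ n \cdot (\overrightarrow{0P} - \overrightarrow{0P_0}) = 0 \]
\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} \right) = 0 \]
\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix} = 0 \]
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \]

We can also write this equation as

\[ ax + by + cz = ax_0 + by_0 + cz_0 \]

Notice that since $P_0$ is given, $ax_0 + by_0 + cz_0$ is a known scalar, which we can call d. This
equation becomes

\[ ax + by + cz = d \]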
Definition 4.43: Scalar Equation of a Plane


Let $n = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ be the normal vector for a plane which contains the point
$P_0 = (x_0, y_0, z_0)$. Then if $P = (x, y, z)$ is an arbitrary point on the plane, the scalar equation
of the plane is given by

\[ ax + by + cz = d \]

where $a, b, c, d \in \mathbb{R}$ and $d = ax_0 + by_0 + cz_0$.


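Consider the following example.



Solution. The above vector n is the normal vector for this plane. Using Definition 4.41, we
can determine the vector equation for this plane.

\[ n \cdot (\overrightarrow{0P} - \overrightarrow{0P_0}) = 0 \]
\[ \begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} 3 \\ -2 \\ 5 \end{bmatrix} \right) = 0 \]
\[ \begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} x - 3 \\ y + 2 \\ z - 5 \end{bmatrix} = 0 \]

Using Definition 4.43, we can determine the scalar equation of the plane.

\[ -2x + 4y + 1z = -2(3) + 4(-2) + 1(5) = -9 \]

Hence, the vector equation of the plane is

\[ \begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} x - 3 \\ y + 2 \\ z - 5 \end{bmatrix} = 0 \]

and the scalar equation is

\[ -2x + 4y + 1z = -9 \]

□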
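
Both plane computations above reduce to dot products, which the following Python sketch illustrates. The function names `plane_scalar_equation` and `on_plane` are hypothetical helpers introduced here; the data are taken from the two worked solutions (normal vector (1, 2, 3) with $P_0$ = (2, 1, 4), and normal vector (-2, 4, 1) with $P_0$ = (3, -2, 5)).

```python
def plane_scalar_equation(n, P0):
    """Return (a, b, c, d) for the plane ax + by + cz = d with normal n through P0."""
    d = sum(a * p for a, p in zip(n, P0))   # d = a*x0 + b*y0 + c*z0
    return (*n, d)

def on_plane(n, P0, P):
    """True when n . (0P - 0P0) = 0, i.e. P satisfies the vector equation."""
    return sum(a * (p - p0) for a, p, p0 in zip(n, P, P0)) == 0

coeffs = plane_scalar_equation((-2, 4, 1), (3, -2, 5))
inside = on_plane((1, 2, 3), (2, 1, 4), (5, 4, 1))
print(coeffs, inside)
```

With integer inputs the membership test is exact; with floating-point coordinates a small tolerance would be needed instead of `== 0`.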


Suppose a point P is not contained in a given plane. We are then interested in the 
shortest distance from that point P to the given plane. Consider the following example. 
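Example 4.45: Shortest Distance From a Point to a Plane


Find the shortest distance from the point P = (3, 2, 3) to the plane given by
2x + y + 2z = 2, and find the point Q on the plane that is closest to P.


Solution. Pick an arbitrary point $P_0$ on the plane. Then, it follows that

\[ \overrightarrow{QP} = \operatorname{proj}_n \overrightarrow{P_0P} \]

and $\|\overrightarrow{QP}\|$ is the shortest distance from P to the plane. Further, the vector $\overrightarrow{0Q} = \overrightarrow{0P} - \overrightarrow{QP}$
gives the necessary point Q.

From the above scalar equation, we have that $n = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$. Now, choose $P_0$ = (1, 0, 0) so

\[ \overrightarrow{P_0P} = \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} \]

Then

\[ \overrightarrow{QP} = \operatorname{proj}_n \overrightarrow{P_0P} = \frac{\overrightarrow{P_0P} \cdot n}{\|n\|^2} n = \frac{12}{9} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{4}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} \]

Then, $\|\overrightarrow{QP}\| = 4$ so the shortest distance from P to the plane is 4.
Next, to find the point Q on the plane which is closest to P we have

\[ \overrightarrow{0Q} = \overrightarrow{0P} - \overrightarrow{QP} = \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} - \frac{4}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \]

Therefore, $Q = \left(\frac{1}{3}, \frac{2}{3}, \frac{1}{3}\right)$.

□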
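
A compact way to code this example, offered as a sketch under one stated observation: since every point $P_0$ on the plane satisfies $n \cdot \overrightarrow{0P_0} = d$, the scalar $\overrightarrow{P_0P} \cdot n / \|n\|^2$ equals $(n \cdot \overrightarrow{0P} - d)/\|n\|^2$, so no particular $P_0$ has to be chosen. The function name `closest_point_on_plane` is an assumption made for illustration.

```python
import math
from fractions import Fraction

def closest_point_on_plane(P, n, d):
    """For the plane n . x = d, return (Q, distance) with Q the point closest to P."""
    nn = sum(Fraction(a) * a for a in n)                       # |n|^2
    k = (sum(Fraction(a) * p for a, p in zip(n, P)) - d) / nn  # QP = k n
    Q = [p - k * a for p, a in zip(P, n)]                      # 0Q = 0P - QP
    dist = abs(k) * math.sqrt(nn)                              # |QP| = |k| |n|
    return Q, dist

# Plane 2x + y + 2z = 2 and point P = (3, 2, 3) from Example 4.45.
Q, dist = closest_point_on_plane((3, 2, 3), (2, 1, 2), 2)
print(Q, dist)
```

The returned Q can be substituted back into the scalar equation to confirm that it lies on the plane.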
4.9 The Cross Product


Outcomes 


A. Compute the cross product and box product of vectors in $\mathbb{R}^3$.


Recall that the dot product is one of two important products for vectors. The second 
type of product for vectors is called the cross product. It is important to note that the 
cross product is only defined in $\mathbb{R}^3$. First we discuss the geometric meaning and then a
description in terms of coordinates is given, both of which are important. The geometric 
description is essential in order to understand the applications to physics and geometry while 
the coordinate description is necessary to compute the cross product. 

Consider the following definition. 


Definition 4.46: Right Hand System of Vectors 


Three vectors, u, v, w form a right hand system if when you extend the fingers of your
right hand along the direction of vector u and close them in the direction of v, the 
thumb points roughly in the direction of w. 


For an example of a right handed system of vectors, see the following picture. 


[Figure: a right hand system of vectors u, v, w, with w pointing upwards from the plane of u and v]



In this picture the vector w points upwards from the plane determined by the other two 
vectors. Point the fingers of your right hand along u, and close them in the direction of v. 
Notice that if you extend the thumb on your right hand, it points in the direction of w. 

You should consider how a right hand system would differ from a left hand system. Try 
using your left hand and you will see that the vector w would need to point in the opposite 
direction. 

Notice that the special vectors, i,j,k will always form a right handed system. If you 
extend the fingers of your right hand along i and close them in the direction j, the thumb 
points in the direction of k. 
[Figure: the special vectors i, j, k forming a right hand system]

The following is the geometric description of the cross product. Recall that the dot 
product of two vectors results in a scalar. In contrast, the cross product results in a vector, 
as the product gives a direction as well as magnitude. 



The cross product of the special vectors i, j, k is as follows.

\[ \begin{array}{ll} i \times j = k & j \times i = -k \\ k \times i = j & i \times k = -j \\ j \times k = i & k \times j = -i \end{array} \]

With this information, the following gives the coordinate description of the cross product. 

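Recall that the vector $u = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}^T$ can be written in terms of i, j, k as
$u = u_1 i + u_2 j + u_3 k$.




Writing u × v in the usual way, it is given by

\[ u \times v = \begin{bmatrix} u_2v_3 - u_3v_2 \\ -(u_1v_3 - u_3v_1) \\ u_1v_2 - u_2v_1 \end{bmatrix} \]
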
We now prove this proposition.

Proof. From the above table and the properties of the cross product listed,

\[ \begin{aligned} u \times v &= (u_1 i + u_2 j + u_3 k) \times (v_1 i + v_2 j + v_3 k) \\ &= u_1v_2\, i \times j + u_1v_3\, i \times k + u_2v_1\, j \times i + u_2v_3\, j \times k + u_3v_1\, k \times i + u_3v_2\, k \times j \\ &= u_1v_2 k - u_1v_3 j - u_2v_1 k + u_2v_3 i + u_3v_1 j - u_3v_2 i \\ &= (u_2v_3 - u_3v_2) i + (u_3v_1 - u_1v_3) j + (u_1v_2 - u_2v_1) k \end{aligned} \tag{4.15} \]

□



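There is another version of 4.14 which may be easier to remember. We can express the
cross product as the determinant of a matrix, as follows.

\[ u \times v = \begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} \tag{4.16} \]

Expanding the determinant along the top row yields

\[ i(-1)^{1+1} \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} + j(-1)^{1+2} \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + k(-1)^{1+3} \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} = i \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} - j \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + k \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \]

Expanding these determinants leads to

\[ (u_2v_3 - u_3v_2) i - (u_1v_3 - u_3v_1) j + (u_1v_2 - u_2v_1) k \]

which is the same as 4.15.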

The cross product satisfies the following properties. 
Proof. Formula 1. follows immediately from the definition. The vectors u × v and v × u have
the same magnitude, $\|u\| \|v\| \sin\theta$, and an application of the right hand rule shows they have
opposite direction.

Formula 2. is proven as follows. If k is a non-negative scalar, the direction of (ku) × v
is the same as the direction of u × v, k(u × v) and u × (kv). The magnitude is k times the
magnitude of u × v which is the same as the magnitude of k(u × v) and u × (kv). Using
this yields equality in 2. In the case where k < 0, everything works the same way except
the vectors are all pointing in the opposite direction and you must multiply by |k| when
comparing their magnitudes.

The distributive laws, 3. and 4., are much harder to establish. For now, it suffices to
notice that if we know that 3. is true, 4. follows. Thus, assuming 3., and using 1.,

\[ \begin{aligned} (v + w) \times u &= -u \times (v + w) \\ &= -(u \times v + u \times w) \\ &= v \times u + w \times u \end{aligned} \]

□


We will now look at an example of how to compute a cross product. 

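Find u × v for the following vectors

\[ u = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad v = \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} \]

Solution. Note that we can write u, v in terms of the special vectors i, j, k as

\[ u = i - j + 2k \]
\[ v = 3i - 2j + k \]

We will use the equation given by 4.16 to compute the cross product.

\[ u \times v = \begin{vmatrix} i & j & k \\ 1 & -1 & 2 \\ 3 & -2 & 1 \end{vmatrix} = \begin{vmatrix} -1 & 2 \\ -2 & 1 \end{vmatrix} i - \begin{vmatrix} 1 & 2 \\ 3 & 1 \end{vmatrix} j + \begin{vmatrix} 1 & -1 \\ 3 & -2 \end{vmatrix} k = 3i + 5j + k \]

We can write this result in the usual way, as

\[ u \times v = \begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix} \]

□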
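
The coordinate formula 4.15 translates directly into code. The sketch below is illustrative (the function names `cross` and `dot` are chosen here); it reproduces the result above and checks two facts discussed in this section: u × v is perpendicular to both u and v, and v × u = -(u × v).

```python
def cross(u, v):
    """Cross product of two vectors in R^3, using the coordinate formula 4.15."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2,
            u3 * v1 - u1 * v3,
            u1 * v2 - u2 * v1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = [1, -1, 2]
v = [3, -2, 1]
w = cross(u, v)
print(w, dot(w, u), dot(w, v), cross(v, u))
```

The function unpacks exactly three components, so it raises an error for vectors that are not in $\mathbb{R}^3$, matching the remark that the cross product is only defined there.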


An important geometrical application of the cross product is as follows. The size of the
cross product, $\|u \times v\|$, is the area of the parallelogram determined by u and v, as shown in
the following picture.

We examine this concept in the following example.


Example 4.51: Area of a Parallelogram 


Find the area of the parallelogram determined by the vectors u and v given by

\[ u = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad v = \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} \]

Solution. Notice that these vectors are the same as the ones given in Example 4.50. Recall
from the geometric description of the cross product, that the area of the parallelogram is
simply the magnitude of u × v. From Example 4.50, u × v = 3i + 5j + k. We can also write
this as

\[ u \times v = \begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix} \]

Thus the area of the parallelogram is

\[ \|u \times v\| = \sqrt{(3)(3) + (5)(5) + (1)(1)} = \sqrt{9 + 25 + 1} = \sqrt{35} \]

□


We can also use this concept to find the area of a triangle. Consider the following example. 


Example 4.52: Area of Triangle 


Find the area of the triangle determined by the points (1, 2, 3), (0, 2, 5), (5, 1, 2).


Solution. This triangle is obtained by connecting the three points with lines. Picking (1, 2, 3)
as a starting point, there are two displacement vectors, $\begin{bmatrix} -1 & 0 & 2 \end{bmatrix}^T$ and $\begin{bmatrix} 4 & -1 & -1 \end{bmatrix}^T$.
Notice that if we add either of these vectors to the position vector of the starting point, the
result is the position vectors of the other two points. Now, the area of the triangle is half the
area of the parallelogram determined by $\begin{bmatrix} -1 & 0 & 2 \end{bmatrix}^T$ and $\begin{bmatrix} 4 & -1 & -1 \end{bmatrix}^T$. The required
cross product is given by

\[ \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \times \begin{bmatrix} 4 \\ -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 7 \\ 1 \end{bmatrix} \]

Taking the size of this vector gives the area of the parallelogram, given by

\[ \sqrt{(2)(2) + (7)(7) + (1)(1)} = \sqrt{4 + 49 + 1} = \sqrt{54} \]

Hence the area of the triangle is $\frac{1}{2}\sqrt{54} = \frac{3}{2}\sqrt{6}$. □

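In general, if you have three points in $\mathbb{R}^3$, P, Q, R, the area of the triangle is given by

\[ \frac{1}{2} \|\overrightarrow{PQ} \times \overrightarrow{PR}\| \]

Recall that $\overrightarrow{PQ}$ is the vector running from point P to point Q.

[Figure: triangle with vertices P, Q, R]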
In the next section, we explore another application of the cross product. 

4.9.1. The Box Product 


Recall that we can use the cross product to find the area of a parallelogram. It follows
that we can use the cross product together with the dot product to find the volume of a 
parallelepiped. 

We begin with a definition. 



The following is an example of a parallelepiped. 
Notice that the base of the parallelepiped is the parallelogram determined by the vectors
u and v. Therefore, its area is equal to $\|u \times v\|$. The height of the parallelepiped is
$\|w\| \cos\theta$ where $\theta$ is the angle shown in the picture between w and u × v. The volume of
this parallelepiped is the area of the base times the height which is just

\[ \|u \times v\| \|w\| \cos\theta = (u \times v) \cdot w \]

This expression is known as the box product and is sometimes written as [u, v, w]. You
should consider what happens if you interchange the v with the w or the u with the w. You
can see geometrically from drawing pictures that this merely introduces a minus sign. In any
case the box product of three vectors always equals either the volume of the parallelepiped
determined by the three vectors or else −1 times this volume.



Consider an example of this concept. 


Example 4.55: Volume of a Parallelepiped 


Find the volume of the parallelepiped determined by the vectors

\[ u = \begin{bmatrix} 1 \\ 2 \\ -5 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 3 \\ -6 \end{bmatrix}, \quad w = \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} \]

Solution. According to the above discussion, pick any two of these vectors, take the cross
product and then take the dot product of this with the third of these vectors. The result
will be either the desired volume or −1 times the desired volume. Therefore by taking the
absolute value of the result, we obtain the volume.

We will take the cross product of u and v. This is given by

\[ u \times v = \begin{bmatrix} 1 \\ 2 \\ -5 \end{bmatrix} \times \begin{bmatrix} 1 \\ 3 \\ -6 \end{bmatrix} = \begin{vmatrix} i & j & k \\ 1 & 2 & -5 \\ 1 & 3 & -6 \end{vmatrix} = 3i + j + k = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} \]

Now take the dot product of this vector with w which yields

\[ (u \times v) \cdot w = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} = (3i + j + k) \cdot (3i + 2j + 3k) = 9 + 2 + 3 = 14 \]

This shows the volume of this parallelepiped is 14 cubic units. □

There is a fundamental observation which comes directly from the geometric definitions 
of the cross product and the dot product. 



Proof. This follows from observing that either (u × v) • w and u • (v × w) both give the
volume of the parallelepiped or they both give −1 times the volume. □


Recall that we can express the cross product as the determinant of a particular matrix. 
It turns out that the same can be done for the box product. Suppose you have three vectors, 
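$u = \begin{bmatrix} a & b & c \end{bmatrix}$, $v = \begin{bmatrix} d & e & f \end{bmatrix}$, and $w = \begin{bmatrix} g & h & i \end{bmatrix}$. Then the box product u • v × w
is given by the following.

\[ u \cdot v \times w = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ d & e & f \\ g & h & i \end{vmatrix} = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix} = \det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \]
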

To take the box product, you can simply take the determinant of the matrix which results 
by letting the rows be the components of the given vectors in the order in which they occur 
in the box product. 

This follows directly from the definition of the cross product given above and the way we 
expand determinants. Thus the volume of a parallelepiped determined by the vectors u, v, w 
is just the absolute value of the above determinant. 
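
The determinant form of the box product can be sketched as follows. The function name `box_product` is hypothetical; the code expands the 3 × 3 determinant whose rows are u, v, w along its first row, as described above, and is checked against Example 4.55.

```python
def box_product(u, v, w):
    """u . (v x w), computed as the determinant of the matrix with rows u, v, w."""
    a, b, c = u
    d, e, f = v
    g, h, i = w
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

u = [1, 2, -5]
v = [1, 3, -6]
w = [3, 2, 3]
volume = abs(box_product(u, v, w))   # absolute value gives the volume
print(box_product(u, v, w), box_product(v, u, w), volume)
```

Swapping two of the vectors changes only the sign of the box product, which is why the volume is taken as the absolute value.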


4.9.2. Exercises 

1. Show that if a x u = 0 for any unit vector u, then a = 0. 
2. Find the area of the triangle determined by the three points, (1, 2, 3), (4, 2, 0) and
(−3, 2, 1).

3. Find the area of the triangle determined by the three points, (1, 0, 3) , (4, 1, 0) and 
(-3,1,1). 

4. Find the area of the triangle determined by the three points, (1, 2, 3) , (2, 3, 4) and 
(3, 4, 5) . Did something interesting happen here? What does it mean geometrically? 



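5. Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}$.

6. Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix}$.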

7. Is u x (v x w) = (u x v) x w? What is the meaning of u x v x w? Explain. Hint: Try 

x j j x k. 

8. Verify directly that the coordinate description of the cross product, u × v has the
property that it is perpendicular to both u and v. Then show by direct computation
that this coordinate description satisfies

\[ \|u \times v\|^2 = \|u\|^2 \|v\|^2 - (u \cdot v)^2 = \|u\|^2 \|v\|^2 \left(1 - \cos^2(\theta)\right) \]

where $\theta$ is the angle included between the two vectors. Explain why $\|u \times v\|$ has the
correct magnitude.

9. Suppose A is a 3 × 3 skew symmetric matrix such that $A^T = -A$. Show there exists a
vector $\Omega$ such that for all $u \in \mathbb{R}^3$

\[ Au = \Omega \times u \]

Hint: Explain why, since A is skew symmetric it is of the form

\[ A = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix} \]

where the $\omega_i$ are numbers. Then consider $\omega_1 i + \omega_2 j + \omega_3 k$.


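10. Find the volume of the parallelepiped determined by the vectors
$\begin{bmatrix} 1 \\ -7 \\ -5 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -2 \\ -6 \end{bmatrix}$, and $\begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix}$.
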
11. Suppose u, v, and w are three vectors whose components are all integers. Can you
conclude the volume of the parallelepiped determined from these three vectors will 
always be an integer? 

12. What does it mean geometrically if the box product of three vectors gives zero? 

13. Using Problem 12, find an equation of a plane containing the two position vectors, p 
and q and the point 0. Hint: If ( x,y,z ) is a point on this plane, the volume of the 
parallelepiped determined by (x, y, z ) and the vectors p, q equals 0. 

14. Using the notion of the box product yielding either plus or minus the volume of the 
parallelepiped determined by the given three vectors, show that 

(u x v) • w = u • (v x w) 

In other words, the dot and the cross can be switched as long as the order of the vectors 
remains the same. Hint: There are two ways to do this, by the coordinate description 
of the dot and cross product and by geometric reasoning. 

15. Simplify (u × v) • (v × w) × (w × z).

16. Simplify $\|u \times v\|^2 + (u \cdot v)^2 - \|u\|^2 \|v\|^2$.

17. For u, v, w functions of t, prove the following product rules:

\[ (u \times v)' = u' \times v + u \times v' \]
\[ (u \cdot v)' = u' \cdot v + u \cdot v' \]

4.10 Spanning, Linear Independence and Basis in $\mathbb{R}^n$


Outcomes 


A. Determine the span of a set of vectors, and determine if a vector is contained in 
a specified span. 

B. Determine if a set of vectors is linearly independent. 

C. Understand the concepts of subspace, basis, and dimension. 

D. Find the row space, column space, and null space of a matrix. 
By generating all linear combinations of a set of vectors one can obtain various subsets
of $\mathbb{R}^n$ which we call subspaces. For example, what set of vectors in $\mathbb{R}^3$ generates the XY-plane?
What is the smallest such set of vectors you can find? The tools of spanning, linear
independence and basis are exactly what is needed to answer these and similar questions
and are the focus of this section.

4.10.1. Spanning Set of Vectors 


We begin this section with a definition. 


Definition 4.57: Span of a Set of Vectors 


The collection of all linear combinations of a set of vectors $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is known
as the span of these vectors and is written as $\operatorname{span}\{u_1, \cdots, u_k\}$.


Consider the following example. 



Solution. You can see that any linear combination of the vectors u and v yields a vector
$\begin{bmatrix} x & y & 0 \end{bmatrix}^T$ in the XY-plane.

Moreover every vector in the XY-plane is in fact such a linear combination of the vectors
u and v. That's because

\[ \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = (-2x + 3y) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (x - y) \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} \]

An arbitrary vector in the XY-plane can be written as a linear combination of u and v.
Thus span{u, v} is precisely the XY-plane. □
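
The coefficients −2x + 3y and x − y in the displayed identity can be spot-checked with a few lines of Python. This is a sketch for illustration, with u = (1, 1, 0) and v = (3, 2, 0) as they appear in that identity; the function name `combine` is an assumption made here.

```python
def combine(x, y):
    """Return (-2x + 3y) u + (x - y) v for u = (1, 1, 0), v = (3, 2, 0)."""
    u = (1, 1, 0)
    v = (3, 2, 0)
    a = -2 * x + 3 * y   # coefficient of u
    b = x - y            # coefficient of v
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

# Every choice of x and y should give back the vector (x, y, 0).
samples = [(4, 5), (0, 0), (-3, 7), (1, -1)]
results = [combine(x, y) for x, y in samples]
print(results)
```

A finite check like this does not replace the algebraic argument, but it is a quick way to catch a sign error in the coefficients.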

You can convince yourself that no single vector can span the XY-plane. In fact, take a 
moment to consider what is meant by the span of a single vector. 

However you can make the set larger if you wish. For example consider the larger set of
vectors {u, v, w} where $w = \begin{bmatrix} 4 & 5 & 0 \end{bmatrix}^T$. Since the first two vectors already span the entire
XY-plane, the span is once again precisely the XY-plane and nothing has been gained. Of
course if you add a new vector such as $w = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}^T$ then it does span a different space.
What is the span of u, v, w in this case?

The distinction between the sets {u, v} and {u, v, w} will be made using the concept of
linear independence.

Consider the vectors u, v, and w discussed above. In the next example, we will show how 
to formally demonstrate that w is in the span of u and v. 
Example 4.59: Vector in a Span


Let $u = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^T$ and $v = \begin{bmatrix} 3 & 2 & 0 \end{bmatrix}^T \in \mathbb{R}^3$. Show that $w = \begin{bmatrix} 4 & 5 & 0 \end{bmatrix}^T$ is in
span{u, v}.


Solution. For a vector to be in span{u, v}, it must be a linear combination of these vectors.
If w ∈ span{u, v}, we must be able to find scalars a, b such that

\[ w = au + bv \]

We proceed as follows.

\[ \begin{bmatrix} 4 \\ 5 \\ 0 \end{bmatrix} = a \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} \]

This is equivalent to the following system of equations

\[ a + 3b = 4 \]
\[ a + 2b = 5 \]

We solve this system the usual way, constructing the augmented matrix and row reducing
to find the reduced row-echelon form.

\[ \left[ \begin{array}{cc|c} 1 & 3 & 4 \\ 1 & 2 & 5 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{cc|c} 1 & 0 & 7 \\ 0 & 1 & -1 \end{array} \right] \]

The solution is a = 7, b = −1. This means that

\[ w = 7u - v \]

Therefore we can say that w is in span{u, v}.

□
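
The row reduction in this example can be reproduced with a small Gauss-Jordan routine. The following is a minimal sketch in exact arithmetic, not a library routine; the function name `rref` is an assumption made here. It is applied to the augmented matrix of the system a + 3b = 4, a + 2b = 5.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of a matrix, computed with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0                                    # index of the next pivot row
    for c in range(len(M[0])):
        if r == len(M):
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]              # swap it into the pivot position
        M[r] = [x / M[r][c] for x in M[r]]   # scale so the pivot equals 1
        for i in range(len(M)):              # clear the rest of column c
            if i != r and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        r += 1
    return M

augmented = [[1, 3, 4],
             [1, 2, 5]]
R = rref(augmented)
a, b = R[0][2], R[1][2]
print(R, a, b)
```

The last column of the reduced matrix holds the coefficients a and b, so w = au + bv can be confirmed by substitution.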


4.10.2. Linearly Independent Set of Vectors 


Together with the notion of a spanning set of vectors, linear independence is a very important 
property of a set of vectors. 


Definition 4.60: Linearly Independent Set of Vectors 


A set of non-zero vectors $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is said to be linearly independent if
no vector in that set is in the span of the other vectors of that set.


Here is an example. 
Example 4.61: Linearly Independent Vectors


Consider the vectors $u = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^T$, $v = \begin{bmatrix} 3 & 2 & 0 \end{bmatrix}^T$, and $w = \begin{bmatrix} 4 & 5 & 0 \end{bmatrix}^T$ in $\mathbb{R}^3$.
Verify whether the set {u, v, w} is linearly independent.


Solution. We already verified in Example 4.59 that w ∈ span{u, v}. Therefore the set
{u, v, w} is not linearly independent. In this case we say it is linearly dependent. □

In terms of spanning, a set of vectors is linearly independent if it does not contain
unnecessary vectors. In the previous example you can see that the vector w does not help to
span any new vector not already in the span of the other two vectors. However you can verify
that the set {u, v} is linearly independent, since both are required to span the XY-plane.

Consider the following important theorem. 


Theorem 4.62: Linear Independence as a Linear Combination 


The collection of vectors $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is linearly independent if and only if
whenever

\[ \sum_{i=1}^{k} a_i u_i = 0 \]

it follows that each $a_i = 0$.

In other words, $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is linearly independent exactly when the system of
linear equations AX = 0 has only the trivial solution, where A is the n × k matrix
having these vectors as columns.


Proof. Suppose first $\{u_1, \cdots, u_k\}$ is linearly independent. Then by Definition 4.60 none of
the vectors is a linear combination of the others. Now suppose for the sake of a contradiction
that

\[ \sum_{i=1}^{k} a_i u_i = 0 \]

and not all the $a_i = 0$. Then pick an $a_i$ which is not zero and divide this equation by it.
Solve for $u_i$ in terms of the other $u_j$, contradicting the fact that none of the $u_i$ equals a linear
combination of the others. Therefore if the set of vectors is linearly independent and a linear
combination of these vectors equals the zero vector, then all the coefficients must equal zero.

Now suppose that whenever a linear combination of the vectors equals the zero vector, all
the coefficients equal zero. We want to show that the vectors are linearly independent. If $u_i$ is a
linear combination of the other vectors in the list, then you could obtain an equation of the
form

\[ u_i = \sum_{j \ne i} a_j u_j \]

and so we could write

\[ 0 = \sum_{j \ne i} a_j u_j + (-1) u_i \]

which is a linear combination equal to the zero vector in which the coefficient of $u_i$ is −1,
contradicting the condition that all coefficients equal 0.

Finally observe that the expression $\sum_{i=1}^{k} a_i u_i = 0$ can be written as the system of linear
equations AX = 0 where A is the n × k matrix having these vectors as columns. □

Sometimes we refer to this last condition about sums as follows: The set of vectors
$\{u_1, \cdots, u_k\}$ is linearly independent if and only if there is no nontrivial linear combination
which equals zero. A nontrivial linear combination is one in which not all the scalars equal
zero. Similarly, a trivial linear combination is one in which all scalars equal zero.

We can say that a set of vectors is linearly dependent if it is not linearly independent, 
and hence if at least one vector is a linear combination of the others. 

Here is a detailed example in $\mathbb{R}^4$.



Solution. Form the $4 \times 4$ matrix $A$ having these vectors as columns:

$$A = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 2 & 1 & 1 & 2 \\ 3 & 0 & 1 & 2 \\ 0 & 1 & 2 & -1 \end{bmatrix}$$


Then by Theorem 4.62, the given set of vectors is linearly independent exactly if the system 
AX = 0 has only the trivial solution. 

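The augmented matrix for this system and corresponding reduced row-echelon form are
given by

$$\left[\begin{array}{rrrr|r} 1 & 2 & 0 & 3 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 3 & 0 & 1 & 2 & 0 \\ 0 & 1 & 2 & -1 & 0 \end{array}\right] \rightarrow \cdots \rightarrow \left[\begin{array}{rrrr|r} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$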


Not all the columns of the coefficient matrix are pivot columns and so the vectors are not 
linearly independent. In this case, we say the vectors are linearly dependent. 

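It follows that there are infinitely many solutions to $AX = 0$, one of which is

$$\begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix}$$

Therefore we can write

$$1 \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix} + 1 \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix} - 1 \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix} - 1 \begin{bmatrix} 3 \\ 2 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$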


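This can be rearranged as follows

$$1 \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix} + 1 \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix} - 1 \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 2 \\ -1 \end{bmatrix}$$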


This gives the last vector as a linear combination of the first three vectors. 

Notice that we could rearrange this equation to write any of the four vectors as a linear
combination of the other three. □

Consider another example. 



Solution. In this case the matrix of the corresponding homogeneous system of linear equations is

$$\left[\begin{array}{rrrr|r} 1 & 2 & 0 & 3 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 3 & 0 & 1 & 2 & 0 \\ 0 & 1 & 2 & 0 & 0 \end{array}\right]$$

The reduced row-echelon form is

$$\left[\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]$$

and so every column of the coefficient matrix is a pivot column. Therefore, these vectors are linearly independent
and there is no way to obtain one of the vectors as a linear combination of the others. □

The following corollary follows from the fact that if the coefficient matrix of a homogeneous
system of linear equations has more columns than rows, the system has infinitely
many solutions.


Corollary 4.65: Linear Dependence in $\mathbb{R}^n$


Let $\{u_1, \cdots, u_k\}$ be a set of vectors in $\mathbb{R}^n$. If $k > n$, then the set is linearly dependent
(i.e. NOT linearly independent).


Proof. Form the $n \times k$ matrix $A$ having the vectors $\{u_1, \cdots, u_k\}$ as its columns and suppose
$k > n$. Then $A$ has rank $r \leq n < k$, so the system $AX = 0$ has a nontrivial solution by
Theorem 1.35, and thus the set is not linearly independent by Theorem 4.62. □
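
The test in Theorem 4.62 can also be carried out by computer. The following Python sketch is only an illustration under stated assumptions: the helper names `rref` and `is_linearly_independent` are hypothetical (they do not come from the text), and exact arithmetic with the standard `fractions` module is assumed so that no rounding occurs. The vectors are placed as the columns of a matrix, the matrix is row reduced, and the set is reported as independent exactly when every column is a pivot column.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # column c has no pivot
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]  # scale so the pivot equals 1
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]  # clear column c
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def is_linearly_independent(vectors):
    """True exactly when every column of A = [u_1 ... u_k] is a pivot column."""
    A = [list(row) for row in zip(*vectors)]  # rows of the n x k matrix A
    _, pivots = rref(A)
    return len(pivots) == len(vectors)

# The four columns of the 4 x 4 matrix A from the detailed example above.
vectors = [(1, 2, 3, 0), (2, 1, 0, 1), (0, 1, 1, 2), (3, 2, 2, -1)]
print(is_linearly_independent(vectors))  # False: the fourth column is not a pivot column
```

Any list of more than $n$ vectors in $\mathbb{R}^n$ makes this function return `False`, since a matrix with $n$ rows has at most $n$ pivot columns; this is the content of Corollary 4.65.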


4.10.3. A Short Application to Chemistry 


The following section applies the concepts of spanning and linear independence to the subject 
of chemistry. 

When working with chemical reactions, there are sometimes a large number of reactions 
and some are in a sense redundant. Suppose you have the following chemical reactions. 

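$$\begin{aligned} \mathrm{CO} + \tfrac{1}{2}\mathrm{O}_2 &\rightarrow \mathrm{CO}_2 \\ \mathrm{H}_2 + \tfrac{1}{2}\mathrm{O}_2 &\rightarrow \mathrm{H}_2\mathrm{O} \\ \mathrm{CH}_4 + \tfrac{3}{2}\mathrm{O}_2 &\rightarrow \mathrm{CO} + 2\mathrm{H}_2\mathrm{O} \\ \mathrm{CH}_4 + 2\mathrm{O}_2 &\rightarrow \mathrm{CO}_2 + 2\mathrm{H}_2\mathrm{O} \end{aligned}$$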

There are four chemical reactions here but they are not independent reactions. There is 
some redundancy. What are the independent reactions? Is there a way to consider a shorter 
list of reactions? To analyze this situation, we can write the reactions in a matrix as follows 


$$\begin{array}{rrrrrr} \mathrm{CO} & \mathrm{O}_2 & \mathrm{CO}_2 & \mathrm{H}_2 & \mathrm{H}_2\mathrm{O} & \mathrm{CH}_4 \\ \hline 1 & 1/2 & -1 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 1 & -1 & 0 \\ -1 & 3/2 & 0 & 0 & -2 & 1 \\ 0 & 2 & -1 & 0 & -2 & 1 \end{array}$$

Each row contains the coefficients of the respective species in each reaction. For example,
the top row of numbers comes from $\mathrm{CO} + \tfrac{1}{2}\mathrm{O}_2 - \mathrm{CO}_2 = 0$ which represents the first of
the chemical reactions.

We can write these coefficients in the following matrix

$$\begin{bmatrix} 1 & 1/2 & -1 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 1 & -1 & 0 \\ -1 & 3/2 & 0 & 0 & -2 & 1 \\ 0 & 2 & -1 & 0 & -2 & 1 \end{bmatrix}$$

Rather than listing all of the reactions as above, it would be more efficient to only list those
which are independent by throwing out that which is redundant. We can use the concepts
of the previous section to accomplish this.

First, take the reduced row-echelon form of the above matrix.

$$\begin{bmatrix} 1 & 0 & 0 & 3 & -1 & -1 \\ 0 & 1 & 0 & 2 & -2 & 0 \\ 0 & 0 & 1 & 4 & -2 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The top three rows represent “independent” reactions which come from the original four 
reactions. One can obtain each of the original four rows of the matrix given above by taking 
a suitable linear combination of rows of this reduced row-echelon form matrix. 

With the redundant reaction removed, we can consider the simplified reactions as the 
following equations 

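$$\begin{aligned} \mathrm{CO} + 3\mathrm{H}_2 - 1\mathrm{H}_2\mathrm{O} - 1\mathrm{CH}_4 &= 0 \\ \mathrm{O}_2 + 2\mathrm{H}_2 - 2\mathrm{H}_2\mathrm{O} &= 0 \\ \mathrm{CO}_2 + 4\mathrm{H}_2 - 2\mathrm{H}_2\mathrm{O} - 1\mathrm{CH}_4 &= 0 \end{aligned}$$

In terms of the original notation, these are the reactions

$$\begin{aligned} \mathrm{CO} + 3\mathrm{H}_2 &\rightarrow \mathrm{H}_2\mathrm{O} + \mathrm{CH}_4 \\ \mathrm{O}_2 + 2\mathrm{H}_2 &\rightarrow 2\mathrm{H}_2\mathrm{O} \\ \mathrm{CO}_2 + 4\mathrm{H}_2 &\rightarrow 2\mathrm{H}_2\mathrm{O} + \mathrm{CH}_4 \end{aligned}$$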

These three reactions provide an equivalent system to the original four equations. The 
idea is that, in terms of what happens chemically, you obtain the same information with the 
shorter list of reactions. Such a simplification is especially useful when dealing with very 
large lists of reactions which may result from experimental evidence. 
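
For long lists of reactions the row reduction is best left to a computer. The sketch below is an illustration only: the function name `independent_rows` and the list `species` are hypothetical names chosen here, and exact rational arithmetic from the standard `fractions` module is assumed. It row reduces the coefficient matrix of the four reactions above and keeps the nonzero rows, each of which represents one independent reaction.

```python
from fractions import Fraction as F

def independent_rows(rows):
    """Row reduce and return the nonzero rows of the reduced row-echelon form."""
    m = [[F(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m[:r]  # the first r rows are exactly the nonzero rows

species = ["CO", "O2", "CO2", "H2", "H2O", "CH4"]
reactions = [
    [1, F(1, 2), -1, 0, 0, 0],    # CO + 1/2 O2 - CO2 = 0
    [0, F(1, 2), 0, 1, -1, 0],    # H2 + 1/2 O2 - H2O = 0
    [-1, F(3, 2), 0, 0, -2, 1],   # CH4 + 3/2 O2 - CO - 2 H2O = 0
    [0, 2, -1, 0, -2, 1],         # CH4 + 2 O2 - CO2 - 2 H2O = 0
]

for row in independent_rows(reactions):
    terms = [f"({c}) {s}" for c, s in zip(row, species) if c != 0]
    print(" + ".join(terms), "= 0")
```

The number of rows returned is the rank of the matrix, which here is the number of independent reactions.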

4.10.4. Subspaces and Basis 


A subspace is simply a set of vectors with the property that linear combinations of these 
vectors remain in the set. Geometrically, subspaces are represented by lines and planes which 
contain the origin. The precise definition is as follows. 



More generally this means that a subspace contains the span of any finite collection of
vectors in that subspace. It turns out that in $\mathbb{R}^n$, a subspace is exactly the span of finitely
many of its vectors.




Theorem 4.67: Subspaces are Spans 


Let $V$ be a nonempty collection of vectors in $\mathbb{R}^n$. Then $V$ is a subspace of $\mathbb{R}^n$ if and
only if there exist vectors $\{u_1, \cdots, u_k\}$ in $V$ such that

$$V = \operatorname{span}\{u_1, \cdots, u_k\}$$


Proof. Suppose first that $V$ is a subspace. Pick a vector $u_1$ in $V$. If $V = \operatorname{span}\{u_1\}$, then you have found your list of vectors
and are done. If $V \neq \operatorname{span}\{u_1\}$, then there exists $u_2$ a vector of $V$ which is not in
$\operatorname{span}\{u_1\}$. Consider $\operatorname{span}\{u_1, u_2\}$. If $V = \operatorname{span}\{u_1, u_2\}$, we are done. Otherwise, pick $u_3$
not in $\operatorname{span}\{u_1, u_2\}$. Continue this way. Note that since $V$ is a subspace, these spans are
each contained in $V$. The process must stop with $u_k$ for some $k \leq n$ by Corollary 4.65.

Now suppose $V = \operatorname{span}\{u_1, \cdots, u_k\}$; we must show this is a subspace. So let $\sum_{i=1}^{k} c_i u_i$
and $\sum_{i=1}^{k} d_i u_i$ be two vectors in $V$, and let $a$ and $b$ be two scalars. Then

$$a \sum_{i=1}^{k} c_i u_i + b \sum_{i=1}^{k} d_i u_i = \sum_{i=1}^{k} (a c_i + b d_i) u_i$$

which is one of the vectors in $\operatorname{span}\{u_1, \cdots, u_k\}$ and is therefore contained in $V$. This shows
that $\operatorname{span}\{u_1, \cdots, u_k\}$ has the properties of a subspace. □
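
The construction in the first half of this proof has a simple computational analogue when a finite spanning list is already known. The Python sketch below is an illustration only, with hypothetical helper names `rank` and `greedy_basis`: it walks through the list and keeps a vector only when it is not in the span of the vectors kept so far, which is detected by an increase in rank. Exact arithmetic with the standard `fractions` module is assumed.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]  # eliminate below the pivot
        r += 1
        if r == len(m):
            break
    return r

def greedy_basis(vectors):
    """Keep each vector that is not in the span of those already kept."""
    basis = []
    for v in vectors:
        if rank(basis + [list(v)]) > len(basis):
            basis.append(list(v))
    return basis

# Three illustrative vectors lying in the XY-plane of R^3.
print(greedy_basis([(1, 1, 0), (3, 2, 0), (4, 5, 0)]))  # [[1, 1, 0], [3, 2, 0]]
```

The vectors kept are linearly independent by construction, and by Corollary 4.65 the list returned can never contain more than $n$ vectors.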

Since the vectors $u_i$ we constructed in the proof above are not in the span of the previous
vectors (by definition), they must be linearly independent and thus we obtain the following
corollary.



In summary, subspaces of $\mathbb{R}^n$ consist of spans of finite, linearly independent collections
of vectors of $\mathbb{R}^n$.

Note that it was just shown in Corollary 4.68 that every subspace of $\mathbb{R}^n$ is equal to the
span of a linearly independent collection of vectors of $\mathbb{R}^n$. Such a collection of vectors is
called a basis.

Definition 4.69: Basis of a Subspace


Let $V$ be a subspace of $\mathbb{R}^n$. Then $\{u_1, \cdots, u_k\}$ is a basis for $V$ if the following two
conditions hold.

1. $\operatorname{span}\{u_1, \cdots, u_k\} = V$

2. $\{u_1, \cdots, u_k\}$ is linearly independent

Note the plural of basis is bases.


The main theorem about bases is not only that they exist, but that they must be of the same
size. To show this, we will need the following fundamental result, called the Exchange
Theorem.


Theorem 4.70: Exchange Theorem 


Suppose $\{u_1, \cdots, u_r\}$ is linearly independent and each $u_k$ is contained in
$\operatorname{span}\{v_1, \cdots, v_s\}$. Then $s \geq r$. In words, spanning sets have at least as many vectors
as linearly independent sets.


Proof. Since $\{v_1, \cdots, v_s\}$ is a spanning set, there exist scalars $a_{ij}$ such that

$$u_j = \sum_{i=1}^{s} a_{ij} v_i$$

Suppose for a contradiction that $s < r$. Then the matrix $A = [a_{ij}]$ has fewer rows, $s$, than
columns, $r$. By Theorem 1.35 there exists a nontrivial solution $d$ such that $d \neq 0$ but
$Ad = 0$. In other words,

$$\sum_{j=1}^{r} a_{ij} d_j = 0, \quad i = 1, 2, \cdots, s$$

Therefore,

$$\sum_{j=1}^{r} d_j u_j = \sum_{j=1}^{r} d_j \sum_{i=1}^{s} a_{ij} v_i = \sum_{i=1}^{s} \left( \sum_{j=1}^{r} a_{ij} d_j \right) v_i = \sum_{i=1}^{s} 0 v_i = 0$$

which contradicts the assumption that $\{u_1, \cdots, u_r\}$ is linearly independent, because not all
the $d_j = 0$. Thus this contradiction indicates that $s \geq r$. □


We are now ready to show that any two bases are of the same size.

Theorem 4.71: Bases of $\mathbb{R}^n$ are of the Same Size


Let $V$ be a subspace of $\mathbb{R}^n$ and suppose $\{u_1, \cdots, u_k\}$ and $\{v_1, \cdots, v_m\}$ are two bases
for $V$. Then $k = m$.


Proof. This follows right away from Theorem 4.70, the Exchange Theorem. Indeed observe that $\{u_1, \cdots, u_k\}$ is a
spanning set while $\{v_1, \cdots, v_m\}$ is linearly independent so $k \geq m$. Also $\{v_1, \cdots, v_m\}$ is a
spanning set while $\{u_1, \cdots, u_k\}$ is linearly independent so $m \geq k$. □

The following definition can now be stated. 


Definition 4.72: Dimension of a Subspace 


Let $V$ be a subspace of $\mathbb{R}^n$. Then the dimension of $V$, written $\dim(V)$, is defined to
be the number of vectors in a basis.


The next result follows. 



Proof. You only need to exhibit a basis for $\mathbb{R}^n$ which has $n$ vectors. Such a basis is
$\{e_1, \cdots, e_n\}$. □


We conclude this section by stating further properties of a set of vectors in $\mathbb{R}^n$.



Proof. Assume first that $\{u_1, \cdots, u_n\}$ is linearly independent, and we need to show that
this set spans $\mathbb{R}^n$. To do so, let $v$ be a vector of $\mathbb{R}^n$, and we need to write $v$ as a linear
combination of the $u_i$'s. Consider the matrix $A$ having the vectors $u_i$ as columns:

$$A = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}$$

By linear independence of the $u_i$'s, the reduced row-echelon form of $A$ is the identity matrix.
Therefore the system $Ax = v$ has a (unique) solution $x$, so $v$ is a linear combination
of the $u_i$'s.

To establish the second claim, suppose that $m < n$. Then letting $v_{i_1}, \cdots, v_{i_k}$ be the pivot
columns of the matrix

$$\begin{bmatrix} v_1 & \cdots & v_m \end{bmatrix}$$

it follows $k \leq m < n$ and these $k$ pivot columns would be a basis for $\mathbb{R}^n$ having fewer than
$n$ vectors, contrary to Corollary 4.73.

Finally consider the third claim. If $\{v_1, \cdots, v_n\}$ is not linearly independent, then replace
this list with $\{v_{i_1}, \cdots, v_{i_k}\}$ where these are the pivot columns of the matrix

$$\begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}$$

Then $\{v_{i_1}, \cdots, v_{i_k}\}$ spans $\mathbb{R}^n$ and is linearly independent so it is a basis having fewer than $n$
vectors, again contrary to Corollary 4.73. □


4.10.5. Row Space, Column Space, and Null Space of a Matrix 


We begin this section with a new definition. 



Using the reduced row-echelon form, we can obtain an efficient description of the row
and column space of a matrix. Of course the column space can be obtained by simply saying 
that it equals the span of all the columns. However, you can often get the column space as 
the span of fewer columns than this. This is what we mean by an efficient description. The 
next example illustrates this concept. 


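Example 4.76: Rank, Column and Row Space

Find the rank of the following matrix and describe the column and row spaces efficiently.

$$A = \begin{bmatrix} 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 6 & 0 & 2 \\ 3 & 7 & 8 & 6 & 6 \end{bmatrix}$$


Solution. The reduced row-echelon form of $A$ is

$$\begin{bmatrix} 1 & 0 & -9 & 9 & 2 \\ 0 & 1 & 5 & -3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Therefore, the rank is 2.

Notice that all columns of this reduced row-echelon form matrix are in the span of its first two columns,

$$\operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}$$

For example,

$$\begin{bmatrix} -9 \\ 5 \\ 0 \end{bmatrix} = -9 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 5 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$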

Since the original matrix and its reduced row-echelon form are equivalent, all columns of the
original matrix are similarly contained in the span of the first two columns of that matrix.
For example, consider the third column of the original matrix. It can be written as a linear
combination of the first two columns of the original matrix as follows.

$$\begin{bmatrix} 1 \\ 6 \\ 8 \end{bmatrix} = -9 \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix} + 5 \begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix}$$

The column space of the original matrix equals the span of the first two columns of the
original matrix. This is the desired efficient description of the column space.

$$\operatorname{col}(A) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix} \right\}$$


What about an efficient description of the row space? When row operations are used, the 
resulting vectors remain in the row space. Thus the rows in the reduced row-echelon form are 
in the row space of the original matrix. Furthermore, by reversing the row operations, each 
row of the original matrix can be obtained as a linear combination of the rows in the reduced 
row-echelon form. It follows that the span of the nonzero rows in the reduced row-echelon 
form equals the span of the original rows. For the above matrix, the row space equals 

row(A) = span { [ 1 0 -9 9 2 ] , [ 0 1 5 -3 0 ] } 


Notice that the column space of A is given as the span of columns of the original matrix, 
while the row space of A is the span of rows of the reduced row-echelon form of A. 
Consider another example. 


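Example 4.77: Rank, Column and Row Space

Find the rank of the following matrix and describe the column and row spaces efficiently.

$$A = \begin{bmatrix} 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 6 & 0 & 2 \\ 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 2 & 4 & 0 \end{bmatrix}$$


Solution. The reduced row-echelon form is

$$\begin{bmatrix} 1 & 0 & 0 & 0 & \frac{13}{2} \\ 0 & 1 & 0 & 2 & -\frac{5}{2} \\ 0 & 0 & 1 & -1 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

and so the rank is 3. The row space is given by

$$\operatorname{row}(A) = \operatorname{span}\left\{ \begin{bmatrix} 1 & 0 & 0 & 0 & \frac{13}{2} \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 & 2 & -\frac{5}{2} \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 & -1 & \frac{1}{2} \end{bmatrix} \right\}$$

Notice that the first three columns of the reduced row-echelon form are pivot columns. The
column space is the span of the first three columns in the original matrix,

$$\operatorname{col}(A) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 3 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 6 \\ 1 \\ 2 \end{bmatrix} \right\}$$

□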
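
The procedure used in the last two examples can be summarized in code. The Python sketch below is an illustration only; the function names `rref` and `column_and_row_space` are hypothetical and exact arithmetic with the standard `fractions` module is assumed. The column space basis is read from the pivot columns of the original matrix, while the row space basis is read from the nonzero rows of the reduced row-echelon form.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def column_and_row_space(A):
    """Return (rank, basis of col(A), basis of row(A))."""
    R, pivots = rref(A)
    col_basis = [[row[c] for row in A] for c in pivots]  # pivot columns of the ORIGINAL matrix
    row_basis = [row for row in R if any(row)]           # nonzero rows of the reduced form
    return len(pivots), col_basis, row_basis

A = [[1, 2, 1, 3, 2],
     [1, 3, 6, 0, 2],
     [3, 7, 8, 6, 6]]
rank, col_basis, row_basis = column_and_row_space(A)
print(rank)       # 2
print(col_basis)  # [[1, 1, 3], [2, 3, 7]]
```

Both bases have the same number of vectors, namely the rank, which is the observation made precise by the Rank Theorem.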


Consider the solution given above for Example 4.77, where the rank of A equals 3. Notice 
that the row space and the column space each had dimension equal to 3. It turns out that 
this is not a coincidence. This essential result is referred to as the Rank Theorem and is 
given now. 



The following statements are results of the Rank Theorem. 



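Consider the following example.


Solution. To find $\operatorname{rank}(A)$ we first row reduce to find the reduced row-echelon form.

$$A = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Therefore the rank of $A$ is 2. Now consider $A^T$ given by

$$A^T = \begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix}$$

Again we row reduce to find the reduced row-echelon form.

$$\begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

You can see that $\operatorname{rank}(A^T) = 2$, the same as $\operatorname{rank}(A)$.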

We now define what is meant by the null space of a general $m \times n$ matrix.


Definition 4.81: Null Space, or Kernel, of A 


The null space of a matrix $A$, also referred to as the kernel of $A$, is defined as follows.

$$\ker(A) = \left\{ x : Ax = 0 \right\}$$


It is also referred to quite often using the notation $N(A)$, the $N$ signifying null space.
To find $\ker(A)$ one must solve the system of equations $Ax = 0$. This is a familiar procedure.

Similarly, we can discuss the image of $A$, denoted by $\operatorname{im}(A)$. The image of $A$ consists of
the vectors of $\mathbb{R}^m$ which “get hit” by $A$. The formal definition is as follows.



Consider $A$ as a mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ whose action is given by multiplication. The
following diagram displays this scenario.

$$\ker(A) \subseteq \mathbb{R}^n \xrightarrow{\;A\;} \mathbb{R}^m \supseteq \operatorname{im}(A)$$

As indicated, $\operatorname{im}(A)$ is a subset of $\mathbb{R}^m$ while $\ker(A)$ is a subset of $\mathbb{R}^n$.

Finding $\ker(A)$ is not new! There is just some new terminology being used: $\ker(A)$ is
simply the solution set of the system $Ax = 0$.

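Consider the following example.

Example 4.83: Null Space, or Kernel of A


Let

$$A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 2 & 3 & 3 \end{bmatrix}$$

Find $\ker(A)$.


Solution. In order to find $\ker(A)$, we need to solve the equation $Ax = 0$. This is the usual
procedure of writing the augmented matrix, finding the reduced row-echelon form and then
the solution. The augmented matrix and corresponding reduced row-echelon form are

$$\left[\begin{array}{rrr|r} 1 & 2 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 2 & 3 & 3 & 0 \end{array}\right] \rightarrow \cdots \rightarrow \left[\begin{array}{rrr|r} 1 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

The third column is not a pivot column, and therefore the solution will contain a parameter.
The solution to the system $Ax = 0$ is given by

$$\left\{ \begin{bmatrix} -3t \\ t \\ t \end{bmatrix} : t \in \mathbb{R} \right\}$$

which can be written as

$$\left\{ t \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} : t \in \mathbb{R} \right\}$$

Therefore, the null space of $A$ is all multiples of this vector, which we can write as

$$\ker(A) = \operatorname{span}\left\{ \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} \right\}$$

□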


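Here is a larger example, but the method is entirely similar.


Solution. To find the null space, we need to solve the equation $AX = 0$. The augmented
matrix and corresponding reduced row-echelon form are given by

$$\left[\begin{array}{rrrrr|r} 1 & 2 & 1 & 0 & 1 & 0 \\ 2 & -1 & 1 & 3 & 0 & 0 \\ 3 & 1 & 2 & 3 & 1 & 0 \\ 4 & -2 & 2 & 6 & 0 & 0 \end{array}\right] \rightarrow \cdots \rightarrow \left[\begin{array}{rrrrr|r} 1 & 0 & \frac{3}{5} & \frac{6}{5} & \frac{1}{5} & 0 \\ 0 & 1 & \frac{1}{5} & -\frac{3}{5} & \frac{2}{5} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

It follows that the first two columns are pivot columns, and the next three correspond to
parameters. Therefore, $\ker(A)$ is given by

$$\left\{ \begin{bmatrix} \left(-\frac{3}{5}\right) s + \left(-\frac{6}{5}\right) t + \left(-\frac{1}{5}\right) r \\ \left(-\frac{1}{5}\right) s + \left(\frac{3}{5}\right) t + \left(-\frac{2}{5}\right) r \\ s \\ t \\ r \end{bmatrix} : s, t, r \in \mathbb{R} \right\}$$

We write this in the form

$$\left\{ s \begin{bmatrix} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{bmatrix} + r \begin{bmatrix} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{bmatrix} : s, t, r \in \mathbb{R} \right\}$$

In other words, the null space of this matrix equals the span of the three vectors above. Thus

$$\ker(A) = \operatorname{span}\left\{ \begin{bmatrix} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\}$$

□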

Notice also that the three vectors above are linearly independent and so the dimension 
of ker (A) is 3. The following is true in general. The number of free variables equals the 
dimension of the null space while the number of basic variables equals the number of pivot 
columns which is the rank. 
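
The passage from the reduced row-echelon form to a basis of the null space is mechanical, so it can be sketched in code. The following Python block is an illustration only, with hypothetical helper names `rref` and `null_space_basis` and exact arithmetic from the standard `fractions` module. Each free (non-pivot) column contributes one basis vector: the free variable is set to 1, the other free variables to 0, and the basic variables are read off from the reduced rows.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(A):
    """One basis vector of ker(A) for each free column of A."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)               # this free variable equals 1
        for row, p in enumerate(pivots):
            v[p] = -R[row][free]            # solve for the basic variable in this row
        basis.append(v)
    return basis

A = [[1, 2, 1],
     [0, -1, 1],
     [2, 3, 3]]
basis = null_space_basis(A)
print([[int(x) for x in v] for v in basis])  # [[-3, 1, 1]]
```

The length of the returned list is the number of free variables, so adding it to the number of pivot columns always gives the number of columns of the matrix.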

Before we proceed to an important theorem, we first define what is meant by the nullity 
of a matrix. 


Definition 4.85: Nullity 


The dimension of the null space of a matrix is called the nullity, denoted $\operatorname{null}(A)$.

We can now state an important theorem.



Consider the following example, which we first explored above in Example 4.83.



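Solution. In the above Example 4.83 we determined that the reduced row-echelon form of the
augmented matrix for $A$ is given by

$$\left[\begin{array}{rrr|r} 1 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

Therefore the rank of $A$ is 2. We also determined that the kernel of $A$ is given by

$$\ker(A) = \operatorname{span}\left\{ \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} \right\}$$

Therefore the nullity of $A$ is 1. It follows from Theorem 4.86 that $\operatorname{rank}(A) + \operatorname{null}(A) =
2 + 1 = 3$, which is the number of columns of $A$. □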


4.10.6. Exercises 


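1. Suppose $\{x_1, \cdots, x_k\}$ is a set of vectors from $\mathbb{R}^n$. Show that $0$ is in $\operatorname{span}\{x_1, \cdots, x_k\}$.

2. Let $H = \operatorname{span}\left\{ \begin{bmatrix} 2 & 1 & 1 & 1 \end{bmatrix}^T, \begin{bmatrix} -1 & 0 & -1 & -1 \end{bmatrix}^T, \begin{bmatrix} 5 & 2 & 3 & 3 \end{bmatrix}^T, \begin{bmatrix} -1 & 1 & -2 & -2 \end{bmatrix}^T \right\}$. Find the dimension of $H$ and
determine a basis.

3. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} 0 & 1 & 1 & -1 \end{bmatrix}^T, \begin{bmatrix} -1 & -1 & -2 & 2 \end{bmatrix}^T, \begin{bmatrix} 2 & 3 & 5 & -5 \end{bmatrix}^T, \begin{bmatrix} 0 & 1 & 2 & -2 \end{bmatrix}^T \right\}$. Find the dimension of $H$
and determine a basis.

4. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} -2 & 1 & 1 & -3 \end{bmatrix}^T, \begin{bmatrix} -9 & 4 & 3 & -9 \end{bmatrix}^T, \begin{bmatrix} -33 & 15 & 12 & -36 \end{bmatrix}^T, \begin{bmatrix} -22 & 10 & 8 & -24 \end{bmatrix}^T \right\}$. Find the dimension of
$H$ and determine a basis.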


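5. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} -1 & 1 & -1 & -2 \end{bmatrix}^T, \begin{bmatrix} -4 & 3 & -2 & -4 \end{bmatrix}^T, \begin{bmatrix} -3 & 2 & -1 & -2 \end{bmatrix}^T, \begin{bmatrix} -1 & 1 & -2 & -4 \end{bmatrix}^T, \begin{bmatrix} -7 & 5 & -3 & -6 \end{bmatrix}^T \right\}$. Find the dimension
of $H$ and determine a basis.

6. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} 2 & 3 & 2 & 1 \end{bmatrix}^T, \begin{bmatrix} 8 & 15 & 6 & 3 \end{bmatrix}^T, \begin{bmatrix} 3 & 6 & 2 & 1 \end{bmatrix}^T, \begin{bmatrix} 4 & 6 & 6 & 3 \end{bmatrix}^T, \begin{bmatrix} 8 & 15 & 6 & 3 \end{bmatrix}^T \right\}$. Find the dimension of
$H$ and determine a basis.

7. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} 0 & 2 & 0 & -1 \end{bmatrix}^T, \begin{bmatrix} -1 & 6 & 0 & -2 \end{bmatrix}^T, \begin{bmatrix} -2 & 16 & 0 & -6 \end{bmatrix}^T, \begin{bmatrix} -3 & 22 & 0 & -8 \end{bmatrix}^T \right\}$. Find the dimension of $H$
and determine a basis.

8. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} 5 & 1 & 1 & 4 \end{bmatrix}^T, \begin{bmatrix} 14 & 3 & 2 & 8 \end{bmatrix}^T, \begin{bmatrix} 38 & 8 & 6 & 24 \end{bmatrix}^T, \begin{bmatrix} 47 & 10 & 7 & 28 \end{bmatrix}^T, \begin{bmatrix} 10 & 2 & 3 & 12 \end{bmatrix}^T \right\}$. Find the dimension
of $H$ and determine a basis.

9. Let $H$ denote $\operatorname{span}\left\{ \begin{bmatrix} 6 & 1 & 1 & 5 \end{bmatrix}^T, \begin{bmatrix} 17 & 3 & 2 & 10 \end{bmatrix}^T, \begin{bmatrix} 52 & 9 & 7 & 35 \end{bmatrix}^T, \begin{bmatrix} 18 & 3 & 4 & 20 \end{bmatrix}^T \right\}$. Find the dimension of $H$ and
determine a basis.

10. Let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : \sin(u_1) = 1 \right\}$. Is $M$ a subspace? Explain.

11. Let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : |u_1| \leq 4 \right\}$. Is $M$ a subspace? Explain.

12. Let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_i \geq 0 \text{ for each } i = 1, 2, 3, 4 \right\}$. Is $M$ a subspace? Explain.

13. Let $w, w_1$ be given vectors in $\mathbb{R}^4$ and define

$$M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : w \cdot u = 0 \text{ and } w_1 \cdot u = 0 \right\}.$$

Is $M$ a subspace? Explain.

14. Let $w \in \mathbb{R}^4$ and let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : w \cdot u = 0 \right\}$. Is $M$ a subspace? Explain.

15. Let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_3 \geq u_1 \right\}$. Is $M$ a subspace? Explain.

16. Let $M = \left\{ u = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_3 = u_1 = 0 \right\}$. Is $M$ a subspace? Explain.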
17. Consider the set of vectors $S$ given by

$$S = \left\{ \begin{bmatrix} 4u + v - 5w \\ 12u + 6v - 6w \\ 4u + 4v + 4w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is $S$ a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its
dimension.

18. Consider the set of vectors $S$ given by

$$S = \left\{ \begin{bmatrix} 2u + 6v + 7w \\ -3u - 9v - 12w \\ 2u + 6v + 6w \\ u + 3v + 3w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is $S$ a subspace of $\mathbb{R}^4$? If so, explain why, give a basis for the subspace and find its
dimension.

19. Consider the set of vectors $S$ given by

$$S = \left\{ \begin{bmatrix} 2u + v \\ 6v - 3u + 3w \\ 3v - 6u + 3w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace
and find its dimension.

20. Consider the vectors of the form

$$\left\{ \begin{bmatrix} 2u + v + 7w \\ u - 2v + w \\ -6v - 6w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace
and find its dimension.

21. Consider the vectors of the form

$$\left\{ \begin{bmatrix} 3u + v + 11w \\ 18u + 6v + 66w \\ 28u + 8v + 100w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace
and find its dimension.

22. Consider the vectors of the form

$$\left\{ \begin{bmatrix} 3u + v \\ 2w - 4u \\ 2w - 2v - 8u \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace
and find its dimension.

23. Consider the set of vectors $S$ given by

$$S = \left\{ \begin{bmatrix} u + v + w \\ 2u + 2v + 4w \\ u + v + w \\ 0 \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is $S$ a subspace of $\mathbb{R}^4$? If so, explain why, give a basis for the subspace and find its
dimension.

24. Consider the set of vectors $S$ given by

$$S = \left\{ \begin{bmatrix} v \\ -3u - 3w \\ 8u - 4v + 4w \end{bmatrix} : u, v, w \in \mathbb{R} \right\}$$

Is $S$ a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its
dimension.

25. If you have 5 vectors in $\mathbb{R}^5$ and the vectors are linearly independent, can it always be
concluded they span $\mathbb{R}^5$? Explain.

26. If you have 6 vectors in $\mathbb{R}^5$, is it possible they are linearly independent? Explain.

27. Suppose $A$ is an $m \times n$ matrix and $\{w_1, \cdots, w_k\}$ is a linearly independent set of vectors
in $A(\mathbb{R}^n) \subseteq \mathbb{R}^m$. Now suppose $A z_i = w_i$. Show $\{z_1, \cdots, z_k\}$ is also independent.

28. Suppose $V, W$ are subspaces of $\mathbb{R}^n$. Let $V \cap W$ be all vectors which are in both $V$ and
$W$. Show that $V \cap W$ is a subspace also.

29. Suppose $V$ and $W$ both have dimension equal to 7 and they are subspaces of $\mathbb{R}^{10}$.
What are the possibilities for the dimension of $V \cap W$? Hint: Remember that a linearly
independent set can be extended to form a basis.

30. Suppose $V$ has dimension $p$ and $W$ has dimension $q$ and they are each contained in
a subspace $U$, which has dimension equal to $n$ where $n > \max(p, q)$. What are the
possibilities for the dimension of $V \cap W$? Hint: Remember that a linearly independent
set can be extended to form a basis.

31. Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. Show that

$$\dim(\ker(AB)) \leq \dim(\ker(A)) + \dim(\ker(B)).$$

Hint: Consider the subspace $B(\mathbb{R}^p) \cap \ker(A)$ and suppose a basis for this subspace
is $\{w_1, \cdots, w_k\}$. Now suppose $\{u_1, \cdots, u_r\}$ is a basis for $\ker(B)$. Let $\{z_1, \cdots, z_k\}$ be
such that $B z_i = w_i$ and argue that

$$\ker(AB) \subseteq \operatorname{span}\{u_1, \cdots, u_r, z_1, \cdots, z_k\}.$$


32. Show that if $A$ is an $m \times n$ matrix, then $\ker(A)$ is a subspace of $\mathbb{R}^n$.

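33. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 1 & 3 & 0 & -2 & 0 & 3 \\ 3 & 9 & 1 & -7 & 0 & 8 \\ 1 & 3 & 1 & -3 & 1 & -1 \\ 1 & 3 & -1 & -1 & -2 & 10 \end{bmatrix}$$

34. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 1 & 3 & 0 & -2 & 7 & 3 \\ 3 & 9 & 1 & -7 & 23 & 8 \\ 1 & 3 & 1 & -3 & 9 & 2 \\ 1 & 3 & -1 & -1 & 5 & 4 \end{bmatrix}$$

35. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 1 & 0 & 3 & 0 & 7 & 0 \\ 3 & 1 & 10 & 0 & 23 & 0 \\ 1 & 1 & 4 & 1 & 7 & 0 \\ 1 & -1 & 2 & -2 & 9 & 1 \end{bmatrix}$$

36. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 1 & 0 & 3 \\ 3 & 1 & 10 \\ 1 & 1 & 4 \\ 1 & -1 & 2 \end{bmatrix}$$

37. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 0 & 0 & -1 & 0 & 1 \\ 1 & 2 & 3 & -2 & -18 \\ 1 & 2 & 2 & -1 & -11 \\ -1 & -2 & -2 & 1 & 11 \end{bmatrix}$$

38. Find the rank of the following matrix. Also find a basis for the row and column spaces.

$$\begin{bmatrix} 1 & 0 & 3 & 0 \\ 3 & 1 & 10 & 0 \\ -1 & 1 & -2 & 1 \\ 1 & -1 & 2 & -2 \end{bmatrix}$$

39. Find $\ker(A)$ for the following matrices.

(a) $A = \begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 3 \\ 3 & 2 & 1 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 4 & 0 \\ 3 & 6 & -2 \\ 1 & 2 & -2 \end{bmatrix}$

(d) $A = \begin{bmatrix} 2 & -1 & 3 & 5 \\ 2 & 0 & 1 & 2 \\ 6 & 4 & -5 & -6 \\ 0 & 2 & -4 & -6 \end{bmatrix}$


4.11 Orthogonality and the Gram-Schmidt Process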


Outcomes 


A. Determine if a given set is orthogonal or orthonormal. 

B. Determine if a given matrix is orthogonal. 

C. Given a linearly independent set, use the Gram-Schmidt Process to find corresponding
orthogonal and orthonormal sets.

D. Find the orthogonal projection of a vector onto a subspace. 

E. Find the least squares approximation for a collection of points. 


4.11.1. Orthogonal and Orthonormal Sets 


In this section, we examine what it means for vectors (and sets of vectors) to be orthogonal 
and orthonormal. First, it is necessary to review some important concepts. You may recall 
the definitions for the span of a set of vectors and a linearly independent set of vectors. We
include the definitions and examples here for convenience.


Definition 4.88: Span of a Set of Vectors and Subspace 


The collection of all linear combinations of a set of vectors $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is known
as the span of these vectors and is written as $\operatorname{span}\{u_1, \cdots, u_k\}$.

We call a collection of the form $\operatorname{span}\{u_1, \cdots, u_k\}$ a subspace of $\mathbb{R}^n$.


Consider the following example. 



Solution. You can see that any linear combination of the vectors $u$ and $v$ yields a vector
$\begin{bmatrix} x & y & 0 \end{bmatrix}^T$ in the $XY$-plane.

Moreover every vector in the $XY$-plane is in fact such a linear combination of the vectors
$u$ and $v$. That's because

$$\begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = (-2x + 3y) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (x - y) \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix}$$

Thus $\operatorname{span}\{u, v\}$ is precisely the $XY$-plane. □

The span of a set of vectors in $\mathbb{R}^n$ is what we call a subspace of $\mathbb{R}^n$. A subspace $W$
is characterized by the feature that any linear combination of vectors of $W$ is again a vector
contained in $W$.

Another important property of sets of vectors is called linear independence. 


Definition 4.90: Linearly Independent Set of Vectors 


A set of non-zero vectors $\{u_1, \cdots, u_k\}$ in $\mathbb{R}^n$ is said to be linearly independent if
no vector in that set is in the span of the other vectors of that set.


Here is an example. 


Example 4.91: Linearly Independent Vectors 


Consider vectors $u = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^T$, $v = \begin{bmatrix} 3 & 2 & 0 \end{bmatrix}^T$, and $w = \begin{bmatrix} 4 & 5 & 0 \end{bmatrix}^T \in \mathbb{R}^3$.
Verify whether the set $\{u, v, w\}$ is linearly independent.


Solution. We already verified in Example 4.89 that $\operatorname{span}\{u, v\}$ is the $XY$-plane. Since $w$ is
clearly also in the $XY$-plane, the set $\{u, v, w\}$ is not linearly independent. □

In terms of spanning, a set of vectors is linearly independent if it does not contain
unnecessary vectors. In the previous example you can see that the vector $w$ does not help
to span any new vector not already in the span of the other two vectors. However you can
verify that the set $\{u, v\}$ is linearly independent, since you will not get the $XY$-plane as the
span of a single vector.

We can also determine if a set of vectors is linearly independent by examining linear 
combinations. A set of vectors is linearly independent if and only if whenever a linear 
combination of these vectors equals zero, it follows that all the coefficients equal zero. It 
is a good exercise to verify this equivalence, and this latter condition is often used as the 
(equivalent) definition of linear independence. 

If a subspace is spanned by a linearly independent set of vectors, then we say that this set is
a basis for the subspace.

Definition 4.92: Basis of a Subspace


Let $V$ be a subspace of $\mathbb{R}^n$. Then $\{u_1, \cdots, u_k\}$ is a basis for $V$ if the following two
conditions hold.

1. $\operatorname{span}\{u_1, \cdots, u_k\} = V$

2. $\{u_1, \cdots, u_k\}$ is linearly independent


Thus the set of vectors $\{u, v\}$ from Example 4.91 is a basis for the $XY$-plane in $\mathbb{R}^3$ since it
is both linearly independent and spans the $XY$-plane.

We can now discuss what is meant by an orthogonal set of vectors. We saw in a previous
section (see Proposition 4.34) that two vectors $u$ and $v$ are orthogonal if $u \cdot v = 0$. This idea
can be extended to a set of vectors as follows.



If we have an orthogonal set of vectors and normalize each vector so they have length 1, 
the resulting set is called an orthonormal set of vectors. They can be described as follows. 



Note that all orthonormal sets are orthogonal, but the reverse is not necessarily true
since the vectors may not be normalized. In order to normalize the vectors, we simply need
to divide each by its length.

If an orthogonal set is a basis for a subspace, we call this an orthogonal basis. Similarly,
if an orthonormal set is a basis, we call this an orthonormal basis.

Example 4.95: Orthonormal Set


Consider the set of vectors given by

$$\{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right\}$$

Show that it is an orthogonal set of vectors but not an orthonormal one. Find the
corresponding orthonormal set.


Solution. One easily verifies that $u_1 \cdot u_2 = 0$ and $\{u_1, u_2\}$ is an orthogonal set of vectors.
On the other hand one can compute that $\|u_1\| = \|u_2\| = \sqrt{2} \neq 1$ and thus it is not an
orthonormal set.

Thus to find a corresponding orthonormal set, we simply need to normalize each vector.
We will write $\{w_1, w_2\}$ for the corresponding orthonormal set. Then,

$$w_1 = \frac{1}{\|u_1\|} u_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}$$

Similarly,

$$w_2 = \frac{1}{\|u_2\|} u_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}$$

Therefore the corresponding orthonormal set is

$$\{w_1, w_2\} = \left\{ \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}, \begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \right\}$$

You can verify that this set is orthogonal.

□
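
Normalizing an orthogonal set is a short computation. The Python sketch below is only an illustration; the helper names `dot`, `normalize` and `is_orthonormal` are hypothetical, and because floating point arithmetic is used, the check is made up to a small tolerance `tol` (an assumed value) rather than exactly.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    """Divide a nonzero vector by its length."""
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]

def is_orthonormal(vectors, tol=1e-12):
    """Check w_i . w_j = 0 for i != j and w_i . w_i = 1, up to the tolerance tol."""
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, v) - expected) > tol:
                return False
    return True

u1, u2 = [1, 1], [-1, 1]
print(dot(u1, u2))                                    # 0, so the set is orthogonal
print(is_orthonormal([u1, u2]))                       # False, each vector has length sqrt(2)
print(is_orthonormal([normalize(u1), normalize(u2)])) # True
```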


Consider an orthonormal set of vectors in $\mathbb{R}^n$, written $\{w_1, \cdots, w_k\}$ with $k \leq n$. The
span of these vectors is a subspace $W$ of $\mathbb{R}^n$. If we could show that this orthonormal set
is also linearly independent, we would have a basis of $W$. We will show this in the next
theorem.

Theorem 4.96: Orthonormal Basis of a Subspace


Let $\{w_1, w_2, \cdots, w_k\}$ be an orthonormal set of vectors in $\mathbb{R}^n$. Then this set is linearly
independent and hence forms a basis for the subspace $W = \operatorname{span}\{w_1, w_2, \cdots, w_k\}$.


Proof. To show it is a linearly independent set, suppose a linear combination of these vectors
equals $0$, such as:

$$a_1 w_1 + a_2 w_2 + \cdots + a_k w_k = 0, \quad a_i \in \mathbb{R}$$

We need to show that all $a_i = 0$. To do so, take the dot product of the vector $w_1$ and the
above sum.

$$\begin{aligned} w_1 \cdot (a_1 w_1 + a_2 w_2 + \cdots + a_k w_k) &= w_1 \cdot 0 \\ a_1 (w_1 \cdot w_1) + a_2 (w_1 \cdot w_2) + \cdots + a_k (w_1 \cdot w_k) &= 0 \end{aligned}$$

Now since the set is orthonormal, $w_1 \cdot w_m = 0$ for all $m \neq 1$, so we have:

$$\begin{aligned} a_1 (w_1 \cdot w_1) + a_2 (0) + \cdots + a_k (0) &= 0 \\ a_1 \|w_1\|^2 &= 0 \end{aligned}$$

Since the set is orthonormal, we know that $\|w_1\|^2 = 1$. It follows that $a_1 = 0$.

We can continue in this fashion for the rest of the vectors in the set, and determine that
$a_i = 0$ for all $i = 1, 2, \cdots, k$. Therefore the set $\{w_1, w_2, \cdots, w_k\}$ is linearly independent.

Finally since $W = \operatorname{span}\{w_1, w_2, \cdots, w_k\}$, the set of vectors also spans $W$ and therefore
forms a basis of $W$.

□

4.11.2. Orthogonal Matrices 


Recall that the process to find the inverse of a matrix was often cumbersome. In contrast, 
it was very easy to take the transpose of a matrix. Luckily for some special matrices, the 
transpose equals the inverse. When an n x n matrix has all real entries and its transpose 
equals its inverse, the matrix is called an orthogonal matrix. 

The precise definition is as follows. 


Definition 4.97: Orthogonal Matrices 


A real $n \times n$ matrix $U$ is called an orthogonal matrix if $UU^T = U^TU = I$.


Note that by Theorem 2.63 it suffices to verify only one of these equalities, $UU^T = I$ or
$U^TU = I$.

Consider the following example. 
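
Solution. All we need to do is verify (one of the equations from) the requirements of Definition
4.97.

$$UU^T = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Since $UU^T = I$, this matrix is orthogonal. □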

Here is another example. 


Example 4.99: Orthogonal Matrix

Let $U = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}$. Is $U$ orthogonal?


Solution. Again the answer is yes and this can be verified simply by showing that U T U = I: 


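$$U^TU = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}^T \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$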

□ 
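The verification in these examples amounts to one matrix product and a comparison with the identity. The sketch below shows one way to automate the check of Definition 4.97; the function names `transpose`, `matmul`, and `is_orthogonal` are hypothetical, and the tolerance argument is an assumption added to cope with floating-point entries such as $1/\sqrt{2}$.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def is_orthogonal(u, tol=1e-12):
    """Return True if u is square and U^T U equals I within tol."""
    n = len(u)
    if any(len(row) != n for row in u):
        return False
    product = matmul(transpose(u), u)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


# The matrix of Example 4.99.
U = [[1, 0, 0],
     [0, 0, -1],
     [0, -1, 0]]

print(is_orthogonal(U))                    # True
print(is_orthogonal([[1, 1], [0, 1]]))     # False: columns are not orthonormal
```

Only $U^TU = I$ is tested, since for a square matrix one of the two equalities is enough.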


When we say that $U$ is orthogonal, we are saying that $UU^T = I$, meaning that

$$\sum_j u_{ij}u^T_{jk} = \sum_j u_{ij}u_{kj} = \delta_{ik}$$

In words, the product of the $i$th row of $U$ with the $k$th row gives 1 if $i = k$ and 0 if $i \neq k$.
The same is true of the columns because $U^TU = I$ also. Therefore,

$$\sum_j u^T_{ij}u_{jk} = \sum_j u_{ji}u_{jk} = \delta_{ik}$$

which says that the product of one column with another column gives 1 if the two columns
are the same and 0 if the two columns are different.

More succinctly, this states that if $\vec{u}_1, \cdots, \vec{u}_n$ are the columns of $U$, an orthogonal matrix,
then

$$\vec{u}_i \cdot \vec{u}_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

We will say that the columns form an orthonormal set of vectors, and similarly for 
the rows. Thus a matrix is orthogonal if its rows (or columns) form an orthonormal 
set of vectors. Notice that the convention is to call such a matrix orthogonal rather than 
orthonormal (although this may make more sense!). 


Proposition 4.100: Orthonormal Basis 


The rows of an $n \times n$ orthogonal matrix form an orthonormal basis of $\mathbb{R}^n$. Further,
any orthonormal basis of $\mathbb{R}^n$ can be used to construct an $n \times n$ orthogonal matrix.


Proof. Recall from Theorem 4.96 that an orthonormal set is linearly independent and forms
a basis for its span. Since the rows of an $n \times n$ orthogonal matrix form an orthonormal
set, they must be linearly independent. Now we have $n$ linearly independent vectors, and it
follows that their span equals $\mathbb{R}^n$. Therefore these vectors form an orthonormal basis for $\mathbb{R}^n$.

Suppose we have an orthonormal basis for $\mathbb{R}^n$. Since the basis will contain $n$ vectors,
these can be used to construct an $n \times n$ matrix, with each vector becoming a row. Therefore
the matrix is composed of orthonormal rows, which by our above discussion, means that the
matrix is orthogonal. □

Consider the following proposition. 



Proof. This result follows from the properties of determinants. Recall that for any matrix
$A$, $\det(A^T) = \det(A)$. Consider

$$(\det(U))^2 = \det(U^T)\det(U) = \det(U^TU) = \det(I) = 1$$

Therefore $(\det(U))^2 = 1$ and it follows that $\det(U) = \pm 1$. □

Orthogonal matrices are divided into two classes, proper and improper. The proper
orthogonal matrices are those whose determinant equals 1 and the improper ones are those
whose determinant equals $-1$. The reason for the distinction is that the improper
orthogonal matrices are sometimes considered to have no physical significance. These matrices
cause a change in orientation which would correspond to material passing through itself in a
nonphysical manner. Thus in considering which coordinate systems must be considered in
certain applications, you only need to consider those which are related by a proper
orthogonal transformation. Geometrically, the linear transformations determined by the proper
orthogonal matrices correspond to the composition of rotations.

4.11.3. Gram-Schmidt Process


The Gram-Schmidt process is an algorithm to transform a set of vectors into an orthonormal 
set generating the same collection of linear combinations (see Definition 1.32). 

The goal of the Gram-Schmidt process is to take a linearly independent set of vectors and 
transform it into an orthonormal set with the same span. The first objective is to construct 
an orthogonal set of vectors with the same span, since from there an orthonormal set can be 
obtained by simply dividing each vector by its length. 



Proof. The full proof of this algorithm is beyond this course. However, to give you an idea
of why $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is an orthogonal set, let

$$a_2 = \frac{\vec{u}_2 \cdot \vec{v}_1}{\|\vec{v}_1\|^2}$$

then:

$$\begin{aligned}
\vec{v}_1 \cdot \vec{v}_2 &= \vec{v}_1 \cdot (\vec{u}_2 - a_2\vec{v}_1) \\
&= \vec{v}_1 \cdot \vec{u}_2 - a_2(\vec{v}_1 \cdot \vec{v}_1) \\
&= \vec{v}_1 \cdot \vec{u}_2 - \frac{\vec{u}_2 \cdot \vec{v}_1}{\|\vec{v}_1\|^2}\|\vec{v}_1\|^2 \\
&= (\vec{v}_1 \cdot \vec{u}_2) - (\vec{u}_2 \cdot \vec{v}_1) = 0
\end{aligned}$$

Now that you have shown that $\{\vec{v}_1, \vec{v}_2\}$ is orthogonal, use the same method as above to show
that $\{\vec{v}_1, \vec{v}_2, \vec{v}_3\}$ is also orthogonal, and so on.

Then in a similar fashion you show that $\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_n\} = \mathrm{span}\{\vec{v}_1, \cdots, \vec{v}_n\}$.

Finally, defining $\vec{w}_i = \frac{\vec{v}_i}{\|\vec{v}_i\|}$ for $i = 1, \cdots, n$ does not affect orthogonality and yields
vectors of length 1, hence an orthonormal set. You can also observe that it does not affect
the span either, and the proof would be complete. □

Consider the following example. 


Example 4.103: Find Orthonormal Set with Same Span 


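Consider the set of vectors $\{\vec{u}_1, \vec{u}_2\}$ given as in Example 4.89. That is,

$$\vec{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \vec{u}_2 = \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix}$$

Find an orthonormal set of vectors $\{\vec{w}_1, \vec{w}_2\}$ having the same span.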


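Solution. We already remarked that the set of vectors $\{\vec{u}_1, \vec{u}_2\}$ is linearly independent, so
we can proceed with the Gram-Schmidt algorithm:

$$\vec{v}_1 = \vec{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$

$$\vec{v}_2 = \vec{u}_2 - \left( \frac{\vec{u}_2 \cdot \vec{v}_1}{\|\vec{v}_1\|^2} \right)\vec{v}_1 = \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} - \frac{5}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix}$$

Now simply let

$$\vec{w}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}, \quad \vec{w}_2 = \frac{\vec{v}_2}{\|\vec{v}_2\|} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}$$

You can verify that $\{\vec{w}_1, \vec{w}_2\}$ is an orthonormal set of vectors having the same span as
$\{\vec{u}_1, \vec{u}_2\}$, namely the $XY$-plane. □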
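
The steps of this example can be sketched in code. What follows is a minimal illustration of the classical Gram-Schmidt procedure, under the assumption that the input vectors are linearly independent (otherwise a division by zero occurs); the function name `gram_schmidt` is hypothetical and the arithmetic is floating point, so results agree with the exact values only up to rounding.

```python
import math


def dot(u, v):
    """Dot product of two vectors given as lists of numbers."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthonormal list with the same span as `vectors`.

    Assumes the input vectors are linearly independent.
    """
    orthogonal = []
    for u in vectors:
        v = list(u)
        # Subtract from u its component along each earlier v_j.
        for prev in orthogonal:
            coeff = dot(u, prev) / dot(prev, prev)
            v = [vi - coeff * pi for vi, pi in zip(v, prev)]
        orthogonal.append(v)
    # Normalize each orthogonal vector to length 1.
    return [[vi / math.sqrt(dot(v, v)) for vi in v] for v in orthogonal]


# The vectors of Example 4.103.
w1, w2 = gram_schmidt([[1, 1, 0], [3, 2, 0]])
print(w1)   # approximately [1/sqrt(2),  1/sqrt(2), 0]
print(w2)   # approximately [1/sqrt(2), -1/sqrt(2), 0]
```

Normalizing at the end, rather than after each step, mirrors the order used in the example: first an orthogonal set, then division of each vector by its length.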


4.11.4. Orthogonal Projections 


An important use of the Gram-Schmidt Process is in orthogonal projections, the focus of 
this section. 

You may recall that a subspace of $\mathbb{R}^n$ is a set of vectors which contains the zero vector,
and is closed under addition and scalar multiplication. Let's call such a subspace $W$. In
particular, a plane in $\mathbb{R}^n$ which contains the origin, $(0, 0, \cdots, 0)$, is a subspace of $\mathbb{R}^n$.

Suppose a point $Y$ in $\mathbb{R}^n$ is not contained in $W$. What point $Z$ in $W$ is closest to $Y$?
Using the Gram-Schmidt Process, we can find such a point. Let $\vec{y}, \vec{z}$ represent the position
vectors of the points $Y$ and $Z$ respectively, with $\vec{y} - \vec{z}$ representing the vector connecting
the two points $Y$ and $Z$. It will follow that if $Z$ is the point on $W$ closest to $Y$, then $\vec{y} - \vec{z}$
will be perpendicular to $W$; in other words, $\vec{y} - \vec{z}$ is orthogonal to $W$ (and to every vector
contained in $W$) as in the following diagram.


[Diagram: the point $Y$, the closest point $Z$ in $W$, and the vector $\vec{y} - \vec{z}$ perpendicular to $W$.]



The vector z is called the orthogonal projection of y on W. The definition is given as 
follows. 
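
Definition 4.104: Orthogonal Projection


Let $W$ be a subspace of $\mathbb{R}^n$, and $Y$ be any point in $\mathbb{R}^n$. Then the orthogonal projection
of $Y$ onto $W$ is given by

$$\vec{z} = \mathrm{proj}_W(\vec{y}) = \left( \frac{\vec{y} \cdot \vec{w}_1}{\|\vec{w}_1\|^2} \right)\vec{w}_1 + \left( \frac{\vec{y} \cdot \vec{w}_2}{\|\vec{w}_2\|^2} \right)\vec{w}_2 + \cdots + \left( \frac{\vec{y} \cdot \vec{w}_m}{\|\vec{w}_m\|^2} \right)\vec{w}_m$$

where $\{\vec{w}_1, \vec{w}_2, \cdots, \vec{w}_m\}$ is any orthogonal basis of $W$.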


Therefore, in order to find the orthogonal projection, we must first find an orthogonal
basis for the subspace. Note that one could use an orthonormal basis, but it is not necessary
in this case since, as you can see above, the normalization of each vector is included in the
formula for the projection.

Before we explore this further through an example, we show that the orthogonal projection
yields a point $Z$ (the point whose position vector is the vector $\vec{z}$ above) which is the
point of $W$ closest to $Y$.



Proof. To show that $Z$ is the point in $W$ closest to $Y$, we wish to show that $\|\vec{y} - \vec{z}_1\| > \|\vec{y} - \vec{z}\|$
for all $\vec{z}_1 \neq \vec{z}$ in $W$. We begin by writing $\vec{y} - \vec{z}_1 = (\vec{y} - \vec{z}) + (\vec{z} - \vec{z}_1)$. Now, the vector $\vec{y} - \vec{z}$
is orthogonal to $W$, and $\vec{z} - \vec{z}_1$ is contained in $W$. Therefore these vectors are orthogonal to
each other. By the Pythagorean Theorem, we have that

$$\|\vec{y} - \vec{z}_1\|^2 = \|\vec{y} - \vec{z}\|^2 + \|\vec{z} - \vec{z}_1\|^2 > \|\vec{y} - \vec{z}\|^2$$

This follows because $\vec{z} \neq \vec{z}_1$, so $\|\vec{z} - \vec{z}_1\|^2 > 0$.

Hence, $\|\vec{y} - \vec{z}_1\|^2 > \|\vec{y} - \vec{z}\|^2$. Taking the square root of each side, we obtain the desired
result. □

Consider the following example. 



Solution. We must first find an orthogonal basis for $W$. Notice that $W$ is characterized by
all points $(a, b, c)$ where $c = 2b - a$. In other words,

$$W = \left\{ \begin{bmatrix} a \\ b \\ 2b - a \end{bmatrix} : a, b \in \mathbb{R} \right\}$$
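
We can write $W$ as

$$W = \mathrm{span}\{\vec{u}_1, \vec{u}_2\} = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \right\}$$

Notice that this spanning set is a basis of $W$ as it is linearly independent. We will use the
Gram-Schmidt Process to convert this to an orthogonal basis, $\{\vec{w}_1, \vec{w}_2\}$. In this case, it is
only necessary to find an orthogonal basis, and it is not required that it be orthonormal.

$$\vec{w}_1 = \vec{u}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$

$$\vec{w}_2 = \vec{u}_2 - \left( \frac{\vec{u}_2 \cdot \vec{w}_1}{\|\vec{w}_1\|^2} \right)\vec{w}_1 = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} - \left( \frac{-2}{2} \right)\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

Therefore an orthogonal basis of $W$ is

$$\{\vec{w}_1, \vec{w}_2\} = \left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$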
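We can now use this basis to find the orthogonal projection of the point $Y = (1, 0, 3)$ on
the subspace $W$. We will write the position vector $\vec{y}$ of $Y$ as $\vec{y} = \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$. Using Definition
4.104, we continue as follows:

$$\vec{z} = \mathrm{proj}_W(\vec{y}) = \left( \frac{\vec{y} \cdot \vec{w}_1}{\|\vec{w}_1\|^2} \right)\vec{w}_1 + \left( \frac{\vec{y} \cdot \vec{w}_2}{\|\vec{w}_2\|^2} \right)\vec{w}_2 = \left( \frac{-2}{2} \right)\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + \left( \frac{4}{3} \right)\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} \\ \frac{4}{3} \\ \frac{7}{3} \end{bmatrix}$$

Therefore the point on $W$ closest to the point $(1, 0, 3)$ is $\left( \frac{1}{3}, \frac{4}{3}, \frac{7}{3} \right)$.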


□ 
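The projection formula of Definition 4.104 translates directly into a short computation. The sketch below is illustrative only: the function name `project` is hypothetical, it assumes the basis passed in is already orthogonal, and it uses exact rational arithmetic (Python's `fractions` module) so that integer inputs give results that can be compared with the fractions in the example.

```python
from fractions import Fraction


def dot(u, v):
    """Dot product of two vectors given as lists of numbers."""
    return sum(a * b for a, b in zip(u, v))


def project(y, orthogonal_basis):
    """Orthogonal projection of y onto the span of an ORTHOGONAL basis.

    Integer (or Fraction) entries are assumed so the result is exact.
    """
    z = [Fraction(0)] * len(y)
    for w in orthogonal_basis:
        coeff = Fraction(dot(y, w), dot(w, w))   # (y . w) / ||w||^2
        z = [zi + coeff * wi for zi, wi in zip(z, w)]
    return z


# Orthogonal basis of the plane a - 2b + c = 0 and the point Y = (1, 0, 3).
basis = [[1, 0, -1], [1, 1, 1]]
y = [1, 0, 3]

z = project(y, basis)
print(z)   # [Fraction(1, 3), Fraction(4, 3), Fraction(7, 3)]

# The difference y - z should be orthogonal to every basis vector of W.
difference = [yi - zi for yi, zi in zip(y, z)]
print([dot(difference, w) for w in basis])   # [Fraction(0, 1), Fraction(0, 1)]
```

The second printed line checks the defining property of the projection: $\vec{y} - \vec{z}$ has zero dot product with each vector of the basis of $W$.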


Recall that the vector $\vec{y} - \vec{z}$ is perpendicular (orthogonal) to all the vectors contained in
the plane $W$. Using a basis for $W$, we can in fact find all such vectors which are perpendicular
to $W$. We call this set of vectors the orthogonal complement of $W$ and denote it $W^\perp$.



In the next example, we will look at how to find $W^\perp$.


Example 4.108: Orthogonal Complement 


Let $W$ be a plane given by points satisfying $a - 2b + c = 0$. Find the orthogonal
complement of $W$.


Solution. 

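From Example 4.106 we know that we can write $W$ as

$$W = \mathrm{span}\{\vec{u}_1, \vec{u}_2\} = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \right\}$$

In order to find $W^\perp$, we need to find all $\vec{x}$ which are orthogonal to every vector in this
span.

Let $\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$. In order to satisfy $\vec{x} \cdot \vec{u}_1 = 0$, the following equation must hold.

$$x_1 - x_3 = 0$$

In order to satisfy $\vec{x} \cdot \vec{u}_2 = 0$, the following equation must hold.

$$x_2 + 2x_3 = 0$$

Both of these equations must be satisfied, so we have the following system of equations.

$$\begin{aligned} x_1 - x_3 &= 0 \\ x_2 + 2x_3 &= 0 \end{aligned}$$

To solve, set up the augmented matrix.

$$\left[ \begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \end{array} \right]$$

Using Gaussian Elimination, we find that $W^\perp = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right\}$. □


Consider again the following diagram.


[Diagram: the point $Y$, the closest point $Z$ in $W$, and the vector $\vec{y} - \vec{z}$ perpendicular to $W$.]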



Recall that $Z$ is the point in $W$ closest to the point $Y$. Notice that since the origin is
contained in $W$, the position vector $\vec{z}$ of $Z$ is contained in $W$. The vector $\vec{y} - \vec{z}$ is in $W^\perp$. Now,
let $Z_1$ be any other point in $W$ not equal to $Z$. Then it follows that the distance between
$Y$ and $Z$ is shorter than that between $Y$ and $Z_1$ for all $Z_1$. These results are summarized in
the following important theorem.



We conclude this section with a final example. 

Example 4.110: Vector Written as a Sum of Two Vectors


Let $W$ be a subspace given by $W = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 2 \end{bmatrix} \right\}$, and $Y = (1, 2, 3, 4)$.

Find the point $Z$ in $W$ closest to $Y$, and moreover write $\vec{y}$ as the sum of a vector in
$W$ and a vector in $W^\perp$.





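Solution. From Theorem 4.105, the point $Z$ in $W$ closest to $Y$ is given by $\vec{z} = \mathrm{proj}_W(\vec{y})$.
Notice that since the above vectors already give an orthogonal basis for $W$, we have:

$$\vec{z} = \mathrm{proj}_W(\vec{y}) = \left( \frac{\vec{y} \cdot \vec{w}_1}{\|\vec{w}_1\|^2} \right)\vec{w}_1 + \left( \frac{\vec{y} \cdot \vec{w}_2}{\|\vec{w}_2\|^2} \right)\vec{w}_2 = \left( \frac{4}{2} \right)\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + \left( \frac{10}{5} \right)\begin{bmatrix} 0 \\ 1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix}$$

Therefore the point in $W$ closest to $Y$ is $Z = (2, 2, 2, 4)$.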


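Now, we need to write $\vec{y}$ as the sum of a vector in $W$ and a vector in $W^\perp$. This can easily
be done as follows:

$$\vec{y} = \vec{z} + (\vec{y} - \vec{z})$$

since $\vec{z}$ is in $W$ and as we have seen $\vec{y} - \vec{z}$ is in $W^\perp$.

The vector $\vec{y} - \vec{z}$ is given by

$$\vec{y} - \vec{z} = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} - \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}$$

Therefore, we can write $\vec{y}$ as

$$\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix} + \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}$$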

4.11.5. Least Squares Approximation


It should not be surprising to hear that many problems do not have a perfect solution, and in
these cases the objective is always to try to do the best possible. For example, what does one
do if there are no solutions to a system of linear equations $A\vec{x} = \vec{b}$? It turns out that what
we do is find $\vec{x}$ such that $A\vec{x}$ is as close to $\vec{b}$ as possible. A very important technique that
follows from orthogonal projections is that of the least squares approximation, which allows us
to do exactly that.

We begin with a lemma. 



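Proof. Let $A\vec{x}$ and $A\vec{y}$ be two vectors of $A(\mathbb{R}^n)$. It suffices to verify that if $a, b$ are scalars,
then $aA\vec{x} + bA\vec{y}$ is also in $A(\mathbb{R}^n)$. But $aA\vec{x} + bA\vec{y} = A(a\vec{x} + b\vec{y})$ because $A$ is linear. Since
$(a\vec{x} + b\vec{y})$ is a vector in $\mathbb{R}^n$, it follows that $A(a\vec{x} + b\vec{y})$ is in $A(\mathbb{R}^n)$, a subset of $\mathbb{R}^m$, as required. □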

The following theorem is a rewording of Theorem 4.109 using the subspace $W = A(\mathbb{R}^n)$
and gives the equivalence of an orthogonality condition with a minimization condition. The
following picture illustrates this orthogonality condition and the geometric meaning of this
theorem.




We note a simple but useful observation. 

Lemma 4.113: Transpose and Dot Product


Let $A$ be an $m \times n$ matrix. Then

$$A\vec{x} \cdot \vec{y} = \vec{x} \cdot A^T\vec{y}$$


Proof. This follows from the definitions:

$$A\vec{x} \cdot \vec{y} = \sum_{i,j} a_{ij}x_jy_i = \sum_{i,j} x_ja^T_{ji}y_i = \vec{x} \cdot A^T\vec{y}$$

□ 


The next corollary gives the technique of least squares. 


Corollary 4.114: Least Squares and Normal Equation 


A specific value of x which solves the problem of Theorem 4.112 is obtained by solving 
the equation 

$$A^TA\vec{x} = A^T\vec{y}$$

Furthermore, there always exists a solution to this system of equations. 


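Proof. For $\vec{x}$ the minimizer of Theorem 4.112, $(\vec{y} - A\vec{x}) \cdot A\vec{u} = 0$ for all $\vec{u} \in \mathbb{R}^n$ and from
Lemma 4.113, this is the same as saying

$$A^T(\vec{y} - A\vec{x}) \cdot \vec{u} = 0$$

for all $\vec{u} \in \mathbb{R}^n$. This implies

$$A^T\vec{y} - A^TA\vec{x} = \vec{0}.$$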

Therefore, there is a solution to the equation of this corollary, and it solves the minimization 
problem of Theorem 4.112. □ 

Note that $\vec{x}$ might not be unique, but $A\vec{x}$, the closest point of $A(\mathbb{R}^n)$ to $\vec{y}$, is unique as
was shown in the above argument.
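
To see the normal equation $A^TA\vec{x} = A^T\vec{y}$ at work, here is a small sketch on a hypothetical inconsistent system that does not come from the text: $x_1 = 1$, $x_2 = 1$, $x_1 + x_2 = 0$. The function names are illustrative, the solver handles only the two-unknown case (by Cramer's rule), and integer entries are assumed so that the arithmetic stays exact.

```python
from fractions import Fraction


def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))


def normal_equations(a, y):
    """Return (A^T A, A^T y) for a matrix `a` stored as a list of rows."""
    columns = [list(col) for col in zip(*a)]
    ata = [[dot(ci, cj) for cj in columns] for ci in columns]
    aty = [dot(ci, y) for ci in columns]
    return ata, aty


def solve_2x2(m, rhs):
    """Solve a 2x2 system by Cramer's rule (integer entries assumed)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("A^T A is singular; the minimizer is not unique")
    x0 = Fraction(rhs[0] * m[1][1] - m[0][1] * rhs[1], det)
    x1 = Fraction(m[0][0] * rhs[1] - m[1][0] * rhs[0], det)
    return [x0, x1]


# Hypothetical inconsistent system: x1 = 1, x2 = 1, x1 + x2 = 0.
A = [[1, 0], [0, 1], [1, 1]]
y = [1, 1, 0]

ata, aty = normal_equations(A, y)
x = solve_2x2(ata, aty)
print(x)   # [Fraction(1, 3), Fraction(1, 3)]

# The residual y - Ax should be orthogonal to every column of A.
residual = [yi - dot(row, x) for row, yi in zip(A, y)]
print([dot(col, residual) for col in zip(*A)])   # [Fraction(0, 1), Fraction(0, 1)]
```

The final check is the orthogonality condition used in the proof of the corollary: $A^T(\vec{y} - A\vec{x}) = \vec{0}$.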

An important application of Corollary 4.114 is the problem of finding the least squares 
regression line in statistics. Suppose you are given points in the xy plane 

$$\{(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)\}$$


and you would like to find constants m and b such that the line y = mx + b goes through 
all these points. Of course this will be impossible in general. Therefore, we try to find m, b 
such that the line will be as close as possible. The desired system is 


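$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix}$$

which is of the form $\vec{y} = A\vec{x}$. It is desired to choose $m$ and $b$ to make

$$\left\| A\begin{bmatrix} m \\ b \end{bmatrix} - \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\|$$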

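as small as possible. According to Theorem 4.112 and Corollary 4.114, the best values for
$m$ and $b$ occur as the solution to

$$A^TA\begin{bmatrix} m \\ b \end{bmatrix} = A^T\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad \text{where } A = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}$$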

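Thus, computing $A^TA$,

$$\begin{bmatrix} \sum_{i=1}^n x_i^2 & \sum_{i=1}^n x_i \\ \sum_{i=1}^n x_i & n \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^n x_iy_i \\ \sum_{i=1}^n y_i \end{bmatrix}$$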

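Solving this system of equations for $m$ and $b$ (using Cramer's rule for example) yields:

$$m = \frac{-\left( \sum_{i=1}^n x_i \right)\left( \sum_{i=1}^n y_i \right) + \left( \sum_{i=1}^n x_iy_i \right)n}{\left( \sum_{i=1}^n x_i^2 \right)n - \left( \sum_{i=1}^n x_i \right)^2}$$

and

$$b = \frac{-\left( \sum_{i=1}^n x_i \right)\sum_{i=1}^n x_iy_i + \left( \sum_{i=1}^n y_i \right)\sum_{i=1}^n x_i^2}{\left( \sum_{i=1}^n x_i^2 \right)n - \left( \sum_{i=1}^n x_i \right)^2}$$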


Consider the following example. 



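Solution. In this case we have $n = 5$ data points and we obtain:

$$\sum_{i=1}^5 x_i = 10, \quad \sum_{i=1}^5 y_i = 14, \quad \sum_{i=1}^5 x_iy_i = 38, \quad \sum_{i=1}^5 x_i^2 = 30$$

and hence

$$m = \frac{-10 \cdot 14 + 5 \cdot 38}{5 \cdot 30 - 10^2} = 1.00$$

$$b = \frac{-10 \cdot 38 + 14 \cdot 30}{5 \cdot 30 - 10^2} = 0.80$$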

The least squares regression line for the set of data points is:

$$y = x + 0.8$$

One could use this line to approximate other values for the data. For example, for $x = 6$
one could use $y(6) = 6 + 0.8 = 6.8$ as an approximate value for the data.
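
The closed-form expressions for $m$ and $b$ can be evaluated mechanically. The sketch below is an illustration under stated assumptions: the function name `regression_line` is hypothetical, and the five data points are chosen only because they reproduce the sums used in the solution ($\sum x_i = 10$, $\sum y_i = 14$, $\sum x_iy_i = 38$, $\sum x_i^2 = 30$); the example's own data set is not restated here, so treat these points as assumed.

```python
from fractions import Fraction


def regression_line(points):
    """Slope m and intercept b of the least squares line y = m*x + b.

    Uses the closed-form solution of the 2x2 normal equations.
    Integer coordinates are assumed so the result is exact.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x values coincide; the line is not determined")
    m = Fraction(n * sxy - sx * sy, denom)
    b = Fraction(sy * sxx - sx * sxy, denom)
    return m, b


# Assumed data, consistent with the sums 10, 14, 38 and 30 above.
points = [(0, 1), (1, 2), (2, 2), (3, 4), (4, 5)]

m, b = regression_line(points)
print(m, b)               # 1 4/5
print(float(m * 6 + b))   # 6.8
```

The denominator $n\sum x_i^2 - \left(\sum x_i\right)^2$ is the determinant of $A^TA$; it vanishes exactly when all the $x_i$ are equal, in which case no unique line exists.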

The following diagram shows the data points and the corresponding regression line. 



One could clearly do a least squares fit for curves of the form y = ax 2 + bx + c in the 
same way. In this case you want to solve as well as possible for a, b, and c the system 


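$$\begin{bmatrix} x_1^2 & x_1 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$$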

and one would use the same technique as above. Many other similar problems are important, 
including many in higher dimensions and they are all solved the same way. 

4.11.6. Exercises 


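1. Here are some matrices. Label according to whether they are symmetric, skew symmetric,
or orthogonal.

(a) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 2 & -3 \\ 2 & 1 & 4 \\ -3 & 4 & 7 \end{bmatrix}$

(c) $\begin{bmatrix} 0 & -2 & -3 \\ 2 & 0 & -4 \\ 3 & 4 & 0 \end{bmatrix}$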

2. For $U$ an orthogonal matrix, explain why $\|U\vec{x}\| = \|\vec{x}\|$ for any vector $\vec{x}$. Next explain
why if $U$ is an $n \times n$ matrix with the property that $\|U\vec{x}\| = \|\vec{x}\|$ for all vectors $\vec{x}$, then
$U$ must be orthogonal. Thus the orthogonal matrices are exactly those which preserve
length.


3. Suppose U is an orthogonal n x n matrix. Explain why rank (U) = n. 

4. Fill in the missing entries to make the matrix orthogonal. 


-1 

-1 

i 

V2 

1 

vT 

V 7 ! 

V2 

Vg 


5. Fill in the missing entries to make the matrix orthogonal. 

V2 1 , 


2 \/2 1/9 

3 2 6 V 

2 

3 " 


6. Fill in the missing entries to make the matrix orthogonal. 

' 1 2 _ 

3 y/5 

! o 

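7. Find an orthonormal basis for the span of each of the following sets of vectors.

(a) $\begin{bmatrix} 3 \\ -4 \\ 0 \end{bmatrix}, \begin{bmatrix} 7 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 7 \\ 1 \end{bmatrix}$

(b) $\begin{bmatrix} 3 \\ 0 \\ -4 \end{bmatrix}, \begin{bmatrix} 11 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 7 \end{bmatrix}$

(c) $\begin{bmatrix} 3 \\ 0 \\ -4 \end{bmatrix}, \begin{bmatrix} 5 \\ 0 \\ 10 \end{bmatrix}, \begin{bmatrix} -7 \\ 1 \\ 1 \end{bmatrix}$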

8. Using the Gram-Schmidt process, find an orthonormal basis for the following span:

$$\mathrm{span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}$$

9. Using the Gram Schmidt process find an orthonormal basis for the following span: 



10. The set $V = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} : 2x + 3y - z = 0 \right\}$ is a subspace of $\mathbb{R}^3$. Find an orthonormal
basis for this subspace.

11. Find the least squares solution to the following system. 


$$\begin{aligned} x + 2y &= 1 \\ 2x + 3y &= 2 \\ 3x + 5y &= 4 \end{aligned}$$


12. You are doing experiments and have obtained the ordered pairs,

$$(0, 1), (1, 2), (2, 3.5), (3, 4)$$

Find $m$ and $b$ such that $y = mx + b$ approximates these four points as well as possible.

13. Suppose you have several ordered triples, $(x_i, y_i, z_i)$. Describe how to find a polynomial
such as

$$z = a + bx + cy + dxy + ex^2 + fy^2$$

giving the best fit to the given ordered triples.


4.12 Applications 


Outcomes 


A. Apply the concepts of vectors in $\mathbb{R}^n$ to the applications of physics and work.

4.12.1. Vectors and Physics


Suppose you push on something. Then, your push is made up of two components, how hard 
you push and the direction you push. This illustrates the concept of force. 


Definition 4.116: Force 


Force is a vector. The magnitude of this vector is a measure of how hard it is pushing. 
It is measured in units such as Newtons or pounds or tons. The direction of this vector 
is the direction in which the push is taking place. 


Vectors are used to model force and other physical vectors like velocity. As with all 
vectors, a vector modeling force has two essential ingredients, its magnitude and its direction. 
Recall the special vectors which point along the coordinate axes. These are given by 

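$$\vec{e}_i = [0 \cdots 0 \ 1 \ 0 \cdots 0]^T$$

where the 1 is in the $i$th slot and there are zeros in all the other spaces. The direction of $\vec{e}_i$
is referred to as the $i$th direction.

Consider the following picture which illustrates the case of $\mathbb{R}^3$. Recall that in $\mathbb{R}^3$, we may
refer to these vectors as $\vec{i}$, $\vec{j}$, and $\vec{k}$.


[Figure: the coordinate axes of $\mathbb{R}^3$ with the vectors $\vec{i}$, $\vec{j}$, and $\vec{k}$.]


Given a vector $\vec{u} = [u_1 \cdots u_n]^T$, it follows that

$$\vec{u} = u_1\vec{e}_1 + \cdots + u_n\vec{e}_n = \sum_{k=1}^n u_k\vec{e}_k$$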

What does addition of vectors mean physically? Suppose two forces are applied to some 
object. Each of these would be represented by a force vector and the two forces acting 
together would yield an overall force acting on the object which would also be a force vector 
known as the resultant. Suppose the two vectors are $\vec{u} = \sum_{k=1}^n u_k\vec{e}_k$ and $\vec{v} = \sum_{k=1}^n v_k\vec{e}_k$. Then
the vector $\vec{u}$ involves a component in the $i$th direction given by $u_i\vec{e}_i$, while the component
in the $i$th direction of $\vec{v}$ is $v_i\vec{e}_i$. Then the vector $\vec{u} + \vec{v}$ should have a component in the $i$th
direction equal to $(u_i + v_i)\vec{e}_i$. This is exactly what is obtained when the vectors $\vec{u}$ and $\vec{v}$ are
added.

$$\vec{u} + \vec{v} = [u_1 + v_1 \cdots u_n + v_n]^T = \sum_{i=1}^n (u_i + v_i)\vec{e}_i$$

Thus the addition of vectors according to the rules of addition in $\mathbb{R}^n$ which were presented
earlier yields the appropriate vector which duplicates the cumulative effect of all the vectors
in the sum.

Consider now some examples of vector addition. 


Example 4.117: The Resultant of Three Forces 


There are three ropes attached to a car and three people pull on these ropes. The first
exerts a force of $\vec{F}_1 = 2\vec{i} + 3\vec{j} - 2\vec{k}$ Newtons, the second exerts a force of $\vec{F}_2 = 3\vec{i} + 5\vec{j} + \vec{k}$
Newtons and the third exerts a force of $5\vec{i} - \vec{j} + 2\vec{k}$ Newtons. Find the total force in
the direction of $\vec{i}$.


Solution. To find the total force, we add the vectors as described above. This is given by 

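$$\begin{aligned}
&(2\vec{i} + 3\vec{j} - 2\vec{k}) + (3\vec{i} + 5\vec{j} + \vec{k}) + (5\vec{i} - \vec{j} + 2\vec{k}) \\
&= (2 + 3 + 5)\vec{i} + (3 + 5 + (-1))\vec{j} + (-2 + 1 + 2)\vec{k} \\
&= 10\vec{i} + 7\vec{j} + \vec{k}
\end{aligned}$$

Hence, the total force is $10\vec{i} + 7\vec{j} + \vec{k}$ Newtons. Therefore, the force in the $\vec{i}$ direction is 10
Newtons. □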

Consider another example. 


Example 4.118: Finding a Vector from Geometric Description 


An airplane flies North East at 100 miles per hour. Write this as a vector. 


Solution. 

A picture of this situation follows. 



Therefore, we need to find the vector $\vec{u}$ which has length 100 and direction as shown in
this diagram. We can consider the vector $\vec{u}$ as the hypotenuse of a right triangle having
equal sides, since the direction of $\vec{u}$ corresponds with the $45^\circ$ line. The sides, corresponding
to the $\vec{i}$ and $\vec{j}$ directions, should each be of length $100/\sqrt{2}$. Therefore, the vector is given by

$$\vec{u} = \frac{100}{\sqrt{2}}\vec{i} + \frac{100}{\sqrt{2}}\vec{j} = \begin{bmatrix} \frac{100}{\sqrt{2}} & \frac{100}{\sqrt{2}} \end{bmatrix}^T$$


□ 


This example also motivates the concept of velocity, defined below. 


Definition 4.119: Speed and Velocity 


The speed of an object is a measure of how fast it is going. It is measured in units 
of length per unit time. For example, miles per hour, kilometers per minute, feet 
per second. The velocity is a vector having the speed as the magnitude but also 
specifying the direction. 


Thus the velocity vector in the above example is $\frac{100}{\sqrt{2}}\vec{i} + \frac{100}{\sqrt{2}}\vec{j}$, while the speed is 100 miles
per hour.

Consider the following example. 


Example 4.120: Position From Velocity and Time 


The velocity of an airplane is $100\vec{i} + \vec{j} + \vec{k}$ measured in kilometers per hour and at a
certain instant of time its position is $(1, 2, 1)$.

Find the position of this airplane one minute later. 


Solution. Here imagine a Cartesian coordinate system in which the third component is 
altitude and the first and second components are measured on a line from West to East and 
a line from South to North. 

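Consider the vector $[1 \ 2 \ 1]^T$, which is the initial position vector of the airplane. As
the plane moves, the position vector changes according to the velocity vector. After one
minute (considered as $\frac{1}{60}$ of an hour) the airplane has moved in the $\vec{i}$ direction a distance of
$100 \times \frac{1}{60} = \frac{5}{3}$ kilometer. In the $\vec{j}$ direction it has moved $\frac{1}{60}$ kilometer during this same time,
while it moves $\frac{1}{60}$ kilometer in the $\vec{k}$ direction. Therefore, the new displacement vector for
the airplane is

$$\begin{bmatrix} 1 & 2 & 1 \end{bmatrix}^T + \begin{bmatrix} \frac{5}{3} & \frac{1}{60} & \frac{1}{60} \end{bmatrix}^T = \begin{bmatrix} \frac{8}{3} & \frac{121}{60} & \frac{121}{60} \end{bmatrix}^T$$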


□ 


Now consider an example which involves combining two velocities. 


Example 4.121: Sum of Two Velocities 


A certain river is one half kilometer wide with a current flowing at 4 kilometers per 
hour from East to West. A man swims directly toward the opposite shore from the 
South bank of the river at a speed of 3 kilometers per hour. How far down the river 
does he find himself when he has swam across? How far does he end up swimming? 

Solution. Consider the following picture which demonstrates the above scenario.


[Figure: the river with its current of 4 kilometers per hour flowing West and the swimmer's velocity of 3 kilometers per hour directed across it.]


First we want to know the total time of the swim across the river. The velocity in the
direction across the river is 3 kilometers per hour, and the river is $\frac{1}{2}$ kilometer wide. It
follows the trip takes $1/6$ hour or 10 minutes.

Now, we can compute how far downstream he will end up. Since the river runs at a rate
of 4 kilometers per hour, and the trip takes $1/6$ hour, the distance traveled downstream is
given by $4\left( \frac{1}{6} \right) = \frac{2}{3}$ kilometers.

The distance traveled by the swimmer is given by the hypotenuse of a right triangle. The
two arms of the triangle are given by the distance across the river, $\frac{1}{2}$ km, and the distance
traveled downstream, $\frac{2}{3}$ km. Then, using the Pythagorean Theorem, we can calculate the
total distance $d$ traveled.

$$d = \sqrt{\left( \frac{2}{3} \right)^2 + \left( \frac{1}{2} \right)^2} = \frac{5}{6} \text{ km}$$

Therefore, the swimmer travels a total distance of $\frac{5}{6}$ kilometers. □


4.12.2. Work 


The mathematical concept of work is an application of vectors in $\mathbb{R}^n$. The physical concept
of work differs from the notion of work employed in ordinary conversation. For example, 
suppose you were to slide a 150 pound weight off a table which is three feet high and shuffle 
along the floor for 50 yards, keeping the height always three feet and then deposit this weight 
on another three foot high table. The physical concept of work would indicate that the force 
exerted by your arms did no work during this project. The reason for this definition is that 
even though your arms exerted considerable force on the weight, the direction of motion was 
at right angles to the force they exerted. The only part of a force which does work in the 
sense of physics is the component of the force in the direction of motion. 

Work is defined to be the magnitude of the component of this force times the distance 
over which it acts, when the component of force points in the direction of motion. In the 
case where the force points in exactly the opposite direction of motion work is given by (— 1) 
times the magnitude of this component times the distance. Thus the work done by a force 
on an object as the object moves from one point to another is a measure of the extent to 
which the force contributes to the motion. This is illustrated in the following picture in the 
case where the given force contributes to the motion. 

Recall that for any vector $\vec{u}$ in $\mathbb{R}^n$, we can write $\vec{u}$ as a sum of two vectors, as in

$$\vec{u} = \vec{u}_{\parallel} + \vec{u}_{\perp}$$

For any force $\vec{F}$, we can write this force as the sum of a vector in the direction of the motion
and a vector perpendicular to the motion. In other words,

$$\vec{F} = \vec{F}_{\parallel} + \vec{F}_{\perp}$$

In the above picture the force $\vec{F}$ is applied to an object which moves on the straight
line from $P$ to $Q$. There are two vectors shown, $\vec{F}_{\parallel}$ and $\vec{F}_{\perp}$, and the picture is intended to
indicate that when you add these two vectors you get $\vec{F}$. In other words, $\vec{F} = \vec{F}_{\parallel} + \vec{F}_{\perp}$.
Notice that $\vec{F}_{\parallel}$ acts in the direction of motion and $\vec{F}_{\perp}$ acts perpendicular to the direction of
motion. Only $\vec{F}_{\parallel}$ contributes to the work done by $\vec{F}$ on the object as it moves from $P$ to $Q$.
$\vec{F}_{\parallel}$ is called the component of the force in the direction of motion. From trigonometry, you
see the magnitude of $\vec{F}_{\parallel}$ should equal $\|\vec{F}\| \, |\cos\theta|$. Thus, since $\vec{F}_{\parallel}$ points in the direction of
the vector from $P$ to $Q$, the total work done should equal

$$\|\vec{F}\| \, \|\overrightarrow{PQ}\| \cos\theta = \|\vec{F}\| \, \|\vec{q} - \vec{p}\| \cos\theta$$

Now, suppose the included angle had been obtuse. Then the work done by the force $\vec{F}$
on the object would have been negative because $\vec{F}_{\parallel}$ would point in $-1$ times the direction of
the motion. In this case, $\cos\theta$ would also be negative and so it is still the case that the work
done would be given by the above formula. Thus from the geometric description of the dot
product given above, the work equals

$$\|\vec{F}\| \, \|\vec{q} - \vec{p}\| \cos\theta = \vec{F} \cdot (\vec{q} - \vec{p})$$

This explains the following definition. 


Definition 4.122: Work Done on an Object by a Force 


Let F be a force acting on an object which moves from the point P to the point Q, 
which have position vectors given by p and q respectively. Then the work done on 
the object by the given force equals $\vec{F} \cdot (\vec{q} - \vec{p})$.


Consider the following example. 

Example 4.123: Finding Work


Let $\vec{F} = [2 \ 7 \ -3]^T$ Newtons. Find the work done by this force in moving from
the point $(1, 2, 3)$ to the point $(-9, -3, 4)$ along the straight line segment joining these
points where distances are measured in meters.


Solution. First, compute the vector $\vec{q} - \vec{p}$, given by

$$[-9 \ -3 \ 4]^T - [1 \ 2 \ 3]^T = [-10 \ -5 \ 1]^T$$

According to Definition 4.122 the work done is

$$[2 \ 7 \ -3]^T \cdot [-10 \ -5 \ 1]^T = -20 + (-35) + (-3) = -58 \text{ Newton meters}$$


□ 
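Definition 4.122 reduces the computation of work to a single dot product. The sketch below reproduces the numbers of this example; the function name `work` is hypothetical and units are not tracked, so the caller is assumed to supply the force in Newtons and the points in meters.

```python
def work(force, p, q):
    """Work done by a constant force on an object moving from p to q.

    Computed as F . (q - p); force and points are sequences of numbers.
    """
    displacement = [qi - pi for pi, qi in zip(p, q)]
    return sum(f * d for f, d in zip(force, displacement))


# Force [2, 7, -3] Newtons, motion from (1, 2, 3) to (-9, -3, 4) meters.
print(work([2, 7, -3], (1, 2, 3), (-9, -3, 4)))   # -58
```

A negative result, as here, means that the component of the force along the displacement points opposite to the direction of motion.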

Note that if the force had been given in pounds and the distance had been given in feet, 
the units on the work would have been foot pounds. In general, work has units equal to 
units of a force times units of a length. Recall that 1 Newton meter is equal to 1 Joule. Also 
notice that the work done by the force can be negative as in the above example. 


4.12.3. Exercises 


1. The wind blows from the South at 20 kilometers per hour and an airplane which flies 
at 600 kilometers per hour in still air is heading East. Find the velocity of the airplane 
and its location after two hours. 


2. The wind blows from the West at 30 kilometers per hour and an airplane which flies 
at 400 kilometers per hour in still air is heading North East. Find the velocity of the 
airplane and its position after two hours. 


3. The wind blows from the North at 10 kilometers per hour. An airplane which flies at 
300 kilometers per hour in still air is supposed to go to the point whose coordinates 
are at (100, 100) . In what direction should the airplane fly? 


4. Three forces act on an object. Two are $\begin{bmatrix} 3 \\ -1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -3 \\ 4 \end{bmatrix}$ Newtons. Find the third
force if the object is not to move.


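5. Three forces act on an object. Two are $\begin{bmatrix} 6 \\ -3 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$ Newtons. Find the third
force if the total force on the object is to be $\begin{bmatrix} 7 \\ 1 \\ 3 \end{bmatrix}$.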
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6. A river flows West at the rate of b miles per hour. A boat can move at the rate of 8 
miles per hour. Find the smallest value of b such that it is not possible for the boat to 
proceed directly across the river. 

7. The wind blows from West to East at a speed of 50 miles per hour and an airplane 
which travels at 400 miles per hour in still air is heading North West. What is the 
velocity of the airplane relative to the ground? What is the component of this velocity 
in the direction North? 

8. The wind blows from West to East at a speed of 60 miles per hour and an airplane 
travels at 100 miles per hour in still air. How many degrees West of North 
should the airplane head in order to travel exactly North? 

9. The wind blows from West to East at a speed of 50 miles per hour and an airplane 
which travels at 400 miles per hour in still air is heading somewhat West of North so that, 
with the wind, it is flying due North. It uses 30.0 gallons of gas every hour. If it has 
to travel 600.0 miles due North, how much gas will it use in flying to its destination? 

10. An airplane is flying due north at 500.0 miles per hour but it is not actually going due 
North because there is a wind which is pushing the airplane due east at 40.0 miles per 
hour. After one hour, the plane starts flying 30° East of North. Assuming the plane 
starts at (0,0), where is it after 2 hours? Let North be the direction of the positive y 
axis and let East be the direction of the positive x axis. 

11. City A is located at the origin (0, 0) while city B is located at (300, 500) where distances 
are in miles. An airplane flies at 250 miles per hour in still air. This airplane wants 
to fly from city A to city B but the wind is blowing in the direction of the positive y 
axis at a speed of 50 miles per hour. Find a unit vector such that if the plane heads 
in this direction, it will end up at city B having flown the shortest possible distance. 
How long will it take to get there? 

12. A certain river is one half mile wide with a current flowing at 3.0 miles per hour from 
East to West. A man takes a boat directly toward the opposite shore from the South 
bank of the river at a speed of 5.0 miles per hour. How far down the river does he find 
himself when he has crossed? How far does he end up traveling? 

13. A certain river is one half mile wide with a current flowing at 2 miles per hour from 
East to West. A man can swim at 3 miles per hour in still water. In what direction 
should he swim in order to travel directly across the river? What would the answer to 
this problem be if the river flowed at 3 miles per hour and the man could swim only 
at the rate of 2 miles per hour? 

14. Three forces are applied to a point which does not move. Two of the forces are 
2i + 2j - 6k Newtons and 8i + 8j + 3k Newtons. Find the third force. 

15. The total force acting on an object is to be 4i + 2j - 3k Newtons. A force of -3i - 1j + 8k 
Newtons is being applied. What other force should be applied to achieve the desired 
total force? 

16. A bird flies from its nest 8 km in the direction |7T north of east where it stops to rest 
on a tree. It then flies 1 km in the direction due southeast and lands atop a telephone 
pole. Place an xy coordinate system so that the origin is the bird’s nest, and the 
positive x axis points east and the positive y axis points north. Find the displacement 
vector from the nest to the telephone pole. 

17. If F is a force and D is a vector, show proj_D(F) = (||F|| cos θ) u where u is the unit 
vector in the direction of D, where u = D/||D|| and θ is the included angle between 
the two vectors, F and D. ||F|| cos θ is sometimes called the component of the force, 
F in the direction, D. 

18. A boy drags a sled for 100 feet along the ground by pulling on a rope which is 20 
degrees from the horizontal with a force of 40 pounds. How much work does this force 
do? 

19. A girl drags a sled for 200 feet along the ground by pulling on a rope which is 30 
degrees from the horizontal with a force of 20 pounds. How much work does this force 
do? 

20. A large dog drags a sled for 300 feet along the ground by pulling on a rope which is 45 
degrees from the horizontal with a force of 20 pounds. How much work does this force 
do? 

21. How much work does it take to slide a crate 20 meters along a loading dock by pulling 
on it with a 200 Newton force at an angle of 30° from the horizontal? Express your 
answer in Newton meters. 

22. An object moves 10 meters in the direction of j. There are two forces acting on this 
object, F_1 = i + j + 2k, and F_2 = -5i + 2j - 6k. Find the total work done on the 
object by the two forces. Hint: You can take the work done by the resultant of the 
two forces or you can add the work done by each force. Why? 

23. An object moves 10 meters in the direction of j + i. There are two forces acting on 
this object, F_1 = i + 2j + 2k, and F_2 = 5i + 2j - 6k. Find the total work done on the 
object by the two forces. Hint: You can take the work done by the resultant of the 
two forces or you can add the work done by each force. Why? 

24. An object moves 20 meters in the direction of k + j. There are two forces acting on 
this object, F_1 = i + j + 2k, and F_2 = i + 2j - 6k. Find the total work done on the 
object by the two forces. Hint: You can take the work done by the resultant of the 
two forces or you can add the work done by each force. 


5. Linear Transformations 


5.1 Linear Transformations 


Outcomes 


A. Understand the definition of a linear transformation, and that all linear trans- 
formations are determined by matrix multiplication. 


Recall that when we multiply an m x n matrix by an n x 1 column vector, the result is 
an m x 1 column vector. In this section we will discuss how, through matrix multiplication, 
an m x n matrix transforms an n x 1 column vector into an m x 1 column vector. 

Recall that the n x 1 vector given by 

\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

is said to belong to R^n, which is the set of all n x 1 vectors. In this section, we will discuss 
transformations of vectors in R^n. 

Consider the following example. 


Example 5.1: A Function Which Transforms Vectors 

Consider the matrix A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix}. Show that by matrix multiplication A 
transforms vectors in R^3 into vectors in R^2. 


Solution. First, recall that vectors in R^3 are vectors of size 3 x 1, while vectors in R^2 are of 
size 2 x 1. If we multiply A, which is a 2 x 3 matrix, by a 3 x 1 vector, the result will be a 
2 x 1 vector. This is what we mean when we say that A transforms vectors. 

Now, for \begin{bmatrix} x \\ y \\ z \end{bmatrix} in R^3, multiply on the left by the given matrix to obtain the new vector. 
This product looks like 

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y \\ 2x + y \end{bmatrix} \]

The resulting product is a 2 x 1 vector which is determined by the choice of x and y. Here 
are some numerical examples. 

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \end{bmatrix} \]

Here, the vector \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} in R^3 was transformed by the matrix into the vector \begin{bmatrix} 5 \\ 4 \end{bmatrix} in R^2. 

Here is another example: 

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 10 \\ 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 20 \\ 25 \end{bmatrix} \]

□ 


The idea is to define a function which takes vectors in M 3 and delivers new vectors in M 2 . 
In this case, that function is multiplication by the matrix A. 
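
As a rough illustration, the function "multiply by A" from Example 5.1 can be written out directly in Python. The helper name mat_vec is an assumption made for this sketch; matrices are stored as lists of rows.

```python
def mat_vec(A, x):
    """Multiply an m x n matrix (list of rows) by a vector of length n."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# The 2 x 3 matrix of Example 5.1: it sends vectors in R^3 to vectors in R^2.
A = [[1, 2, 0],
     [2, 1, 0]]

print(mat_vec(A, [1, 2, 3]))    # [5, 4]
print(mat_vec(A, [10, 5, -3]))  # [20, 25]
```

Note that the third entry of the input never affects the output, because the third column of A is zero.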

Let T denote such a function. The notation T : R^n ↦ R^m means that the function T 
transforms vectors in R^n into vectors in R^m. The notation T(x) means the transformation 
T applied to the vector x. The above example demonstrated a transformation achieved by 
matrix multiplication. In this case, we often write 

T_A(x) = Ax 

Therefore, T_A is the transformation determined by the matrix A. In this case we say that T 
is a matrix transformation. 

Recall the properties of matrix multiplication. The pertinent property here is 2.6 which 
states that for k and p scalars, 


A ( kB + pC ) = kAB + pAC 

In particular, for A an m x n matrix and B and C, n x 1 vectors in R^n, this formula holds. 

In other words, this means that matrix multiplication gives an example of a linear 
transformation, which we will now define. 

We began this section by discussing matrix transformations, where multiplication by a 
matrix transforms vectors. These matrix transformations are in fact linear transformations. 


Theorem 5.3: Matrix Transformations are Linear Transformations 


Let T : R^n ↦ R^m be a transformation defined by T(x) = Ax. Then T is a linear 
transformation. 


It turns out that every linear transformation can be expressed as a matrix transformation, 
and thus linear transformations are exactly the same as matrix transformations. This will 
be the content of the next section. 
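
The property A(kB + pC) = kAB + pAC quoted above can be spot-checked numerically. The sketch below does this for the matrix of Example 5.1 with arbitrarily chosen vectors and scalars; a single check like this illustrates the property but is of course not a proof. The helper names mat_vec and combo are assumptions of this sketch.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def combo(k, u, p, v):
    """Return the linear combination k*u + p*v of two vectors."""
    return [k * a + p * b for a, b in zip(u, v)]


A = [[1, 2, 0],
     [2, 1, 0]]
u, v = [1, 2, 3], [10, 5, -3]   # arbitrary vectors in R^3
k, p = 4, -7                    # arbitrary scalars

lhs = mat_vec(A, combo(k, u, p, v))              # A(ku + pv)
rhs = combo(k, mat_vec(A, u), p, mat_vec(A, v))  # k(Au) + p(Av)
print(lhs, rhs, lhs == rhs)  # [-120, -159] [-120, -159] True
```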


5.1.1. Exercises 

1. Show the map T : R^n ↦ R^m defined by T(x) = Ax where A is an m x n matrix and 
x is an n x 1 column vector is a linear transformation. 

2. Show that the function T_u defined by T_u(w) = w - proj_u(w) is also a linear 
transformation. 

3. Let u be a fixed vector. The function T_u defined by T_u(v) = u + v has the effect of 
translating all vectors by adding u ≠ 0. Show this is not a linear transformation. 
Explain why it is not possible to represent T_u in R^3 by multiplying by a 3 x 3 matrix. 

5.2 The Matrix of a Linear Transformation 


Outcomes 


A. Find the matrix of a linear transformation and determine the action on a vector 
in R^n. 


In the above examples, the action of the linear transformations was to multiply by a 
matrix. It turns out that this is always the case for linear transformations. If T is any linear 
transformation which maps R^n to R^m, there is always an m x n matrix A with the property 
that 

T(x) = Ax          (5.1) 

for all x ∈ R^n. 


Theorem 5.4: Matrix of a Linear Transformation 


Let T : R^n ↦ R^m be a linear transformation. Then we can find a matrix A such that 
T(x) = Ax. In this case, we say that T is determined or induced by the matrix A. 


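Here is why. Suppose T : R^n ↦ R^m is a linear transformation and you want to find the 
matrix defined by this linear transformation as described in 5.1. Note that 

\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \sum_{i=1}^{n} x_i e_i \]

where e_i is the i-th column of I_n, that is the n x 1 vector which has zeros in every slot but 
the i-th and a 1 in this slot. 

Then since T is linear, 

\[ T(x) = \sum_{i=1}^{n} x_i T(e_i) = \begin{bmatrix} T(e_1) & \cdots & T(e_n) \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = A \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \]

Therefore, the desired matrix is obtained from constructing the i-th column as T(e_i). We 
state this formally as the following theorem. 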



The following Corollary is an essential result. 


Corollary 5.6: Matrix and Linear Transformation 


A transformation T is a linear transformation if and only if it is a matrix 
transformation. 

Consider the following example. 


Example 5.7: The Matrix of a Linear Transformation 


Suppose T is a linear transformation, T : R^3 → R^2 where 

\[ T\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 9 \\ -3 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Find the matrix A of T such that T(x) = Ax for all x. 


Solution. By Theorem 5.5 we construct A as follows: 

\[ A = \begin{bmatrix} T(e_1) & \cdots & T(e_n) \end{bmatrix} \]

In this case, A will be a 2 x 3 matrix, so we need to find T(e_1), T(e_2), and T(e_3). 
Luckily, we have been given these values so we can fill in A as needed, using these vectors 
as the columns of A. Hence, 

\[ A = \begin{bmatrix} 1 & 9 & 1 \\ 2 & -3 & 1 \end{bmatrix} \]

□ 
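
The construction used in Example 5.7, placing the vectors T(e_1), T(e_2), T(e_3) side by side as columns, can be sketched as follows. The helper names matrix_from_columns and mat_vec are assumptions of this sketch, and the images are the ones given in the example.

```python
def matrix_from_columns(columns):
    """Return the matrix (list of rows) whose i-th column is columns[i]."""
    m = len(columns[0])
    if any(len(col) != m for col in columns):
        raise ValueError("all columns must have the same length")
    return [[col[i] for col in columns] for i in range(m)]


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# T(e_1), T(e_2), T(e_3) as given in Example 5.7
images = [[1, 2], [9, -3], [1, 1]]
A = matrix_from_columns(images)

print(A)                      # [[1, 9, 1], [2, -3, 1]]
print(mat_vec(A, [0, 1, 0]))  # [9, -3], which recovers T(e_2)
```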


In this example, we were given the resulting vectors of T(e_1), T(e_2), and T(e_3). 
Constructing the matrix A was simple, as we could simply use these vectors as the columns of A. 

The next example shows how to find A when we are not given the T(e_i) so clearly. 


Example 5.8: The Matrix of Linear Transformation: Inconveniently Defined 


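Suppose T is known to be a linear transformation, T : R^2 → R^2 and 

\[ T\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix} \]

Find the matrix A of the transformation T. 


Solution. By Theorem 5.5 to find this matrix, we need to determine the action of T on e_1 
and e_2. In Example 5.7, we were given these resulting vectors. However, in this example, we 
have been given T of two different vectors. How can we find out the action of T on e_1 and 
e_2? In particular for e_1, suppose there exist x and y such that 

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 0 \\ -1 \end{bmatrix} \]          (5.2) 

Then, since T is linear, 

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = x\, T\begin{bmatrix} 1 \\ 1 \end{bmatrix} + y\, T\begin{bmatrix} 0 \\ -1 \end{bmatrix} \]

Substituting in values, this sum becomes 

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 2 \end{bmatrix} + y \begin{bmatrix} 3 \\ 2 \end{bmatrix} \]          (5.3) 

Therefore, if we know the values of x and y which satisfy 5.2, we can substitute these 
into equation 5.3. By doing so, we find T(e_1) which is the first column of the matrix A. 

We proceed to find x and y. We do so by solving 5.2, which can be done by solving the 
system 

x = 1 
x - y = 0 

We see that x = 1 and y = 1 is the solution to this system. Substituting these values 
into equation 5.3, we have 

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1 \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} \]

Therefore \begin{bmatrix} 4 \\ 4 \end{bmatrix} is the first column of A. 

Computing the second column is done in the same way, and is left as an exercise. 
The resulting matrix A is given by 

\[ A = \begin{bmatrix} 4 & -3 \\ 4 & -2 \end{bmatrix} \]

□ 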


This example illustrates a very long procedure for finding the matrix A. While this 
method is reliable and will always result in the correct matrix A, the following procedure 
provides an alternative method. 

Recall that 

[ A_1 ··· A_n ] 

denotes a matrix which has for its i-th column the vector A_i. 


Procedure 5.9: Finding the Matrix of Inconveniently Defined Linear Transformation 


Suppose T : R^n → R^m is a linear transformation. Suppose there exist vectors 
{a_1, ···, a_n} in R^n such that [ a_1 ··· a_n ]^{-1} exists, and 

T(a_i) = b_i 

Then the matrix of T must be of the form 

[ b_1 ··· b_n ] [ a_1 ··· a_n ]^{-1} 

We will illustrate this procedure in the following example. You may also find it useful to 
work through Example 5.8 using this procedure. 


Example 5.10: Matrix of a Linear Transformation Given Inconveniently 


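Suppose T : R^3 ↦ R^3 is a linear transformation and 

\[ T\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad T\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Find the matrix of this linear transformation. 


Solution. By Procedure 5.9, 

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 3 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 2 & 0 \\ 1 & 1 & 0 \\ 1 & 3 & 1 \end{bmatrix} \]

Then, Procedure 5.9 claims that the matrix of T is 

\[ C = BA^{-1} = \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \]

Indeed you can first verify that T(x) = Cx for the 3 vectors above: 

\[ \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

But more generally T(x) = Cx for any x. To see this, let y = A^{-1}x and then using 
linearity of T: 

\[ T(x) = T(Ay) = T\Big(\sum_i y_i a_i\Big) = \sum_i y_i T(a_i) = \sum_i y_i b_i = By = BA^{-1}x = Cx \]

□ 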
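
One way to check the matrix C of Example 5.10 without computing A^{-1} by hand is to confirm that CA = B; since A is invertible (its determinant is 1), this is equivalent to C = BA^{-1}. The sketch below uses the matrices A, B and C from the example, and the helper name mat_mul is an assumption of this sketch.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 0, 1],   # columns are the given input vectors a_1, a_2, a_3
     [3, 1, 1],
     [1, 1, 0]]
B = [[0, 2, 0],   # columns are the images b_i = T(a_i)
     [1, 1, 0],
     [1, 3, 1]]
C = [[2, -2, 4],  # claimed matrix of T
     [0, 0, 1],
     [4, -3, 6]]

print(mat_mul(C, A) == B)  # True, so C sends each a_i to b_i
```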


Recall the dot product discussed earlier. Consider the map v ↦ proj_u(v). It turns out 
that this map is linear, a result which follows from the properties of the dot product. This 
is shown as follows. 

\[ \mathrm{proj}_u(kv + pw) = \left( \frac{(kv + pw) \cdot u}{u \cdot u} \right) u = k \left( \frac{v \cdot u}{u \cdot u} \right) u + p \left( \frac{w \cdot u}{u \cdot u} \right) u = k\, \mathrm{proj}_u(v) + p\, \mathrm{proj}_u(w) \]

Consider the following example. 



Solution. 


1. First, we have just seen that T(v) = proj_u(v) is linear. Therefore by Theorem 5.4, we 
can find a matrix A such that T(x) = Ax. 

2. The columns of the matrix for T are defined above as T(e_i). It follows that T(e_i) = 
proj_u(e_i) gives the i-th column of the desired matrix. Therefore, we need to find 

\[ \mathrm{proj}_u(e_i) = \left( \frac{e_i \cdot u}{u \cdot u} \right) u \]

For the given vector u, this implies the columns of the desired matrix are 

\[ \frac{1}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{2}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{3}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

which you can verify using Definition 4.37. Hence the matrix of T is 

\[ \frac{1}{14} \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{bmatrix} \]

□ 
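
The column-by-column construction in this solution can be sketched in Python with exact fractions. The vector u = [ 1 2 3 ]^T used below is an assumption read off from the columns displayed in the solution, and the function name projection_matrix is chosen for this sketch only.

```python
from fractions import Fraction


def projection_matrix(u):
    """Matrix of v -> proj_u(v); its j-th column is (u_j / (u . u)) u."""
    uu = sum(c * c for c in u)
    if uu == 0:
        raise ValueError("u must be a nonzero vector")
    n = len(u)
    return [[Fraction(u[i] * u[j], uu) for j in range(n)] for i in range(n)]


u = [1, 2, 3]   # assumed from the columns shown in the solution
P = projection_matrix(u)

# Multiply every entry by 14 to compare with the matrix in the text.
print([[int(14 * entry) for entry in row] for row in P])
# [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```

Exact fractions are used so that entries such as 2/14 are not blurred by floating-point rounding.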


5.2.1. Exercises 

1. Consider the following functions which map R^n to R^n. 

(a) T multiplies the j-th component of x by a nonzero number b. 

(b) T replaces the i-th component of x with b times the j-th component added to the 
i-th component. 

(c) T switches the i-th and j-th components. 

Show these functions are linear transformations and describe their matrices A such 
that T(x) = Ax. 

2. You are given a linear transformation T : R^n → R^m and you know that 

T(A_i) = B_i 

where [ A_1 ··· A_n ]^{-1} exists. Show that the matrix of T is of the form 

[ B_1 ··· B_n ] [ A_1 ··· A_n ]^{-1} 

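3. Suppose T is a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 2 \\ -6 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}, \quad T\begin{bmatrix} -1 \\ -1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ -2 \end{bmatrix} \]

Find the matrix of T. That is find A such that T(x) = Ax. 

4. Suppose T is a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 1 \\ -8 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \quad T\begin{bmatrix} -1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \\ -1 \end{bmatrix} \]

Find the matrix of T. That is find A such that T(x) = Ax. 

5. Suppose T is a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 3 \\ -7 \end{bmatrix} = \begin{bmatrix} -3 \\ 1 \\ 3 \end{bmatrix}, \quad T\begin{bmatrix} -1 \\ -2 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ -3 \end{bmatrix}, \quad T\begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ -3 \end{bmatrix} \]

Find the matrix of T. That is find A such that T(x) = Ax. 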

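6. Suppose T is a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} \]



Find the matrix of T. That is find A such that T(x) = Ax. 

7. Suppose T is a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 2 \\ -18 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ 5 \end{bmatrix} \]



\[ T\begin{bmatrix} 0 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ -2 \end{bmatrix} \]

Find the matrix of T. That is find A such that T(x) = Ax. 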

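8. Consider the following functions T : R^3 → R^2. Show that each is a linear 
transformation and determine for each the matrix A such that T(x) = Ax. 

(a) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix} 

(b) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z \\ 3x - 11y + 2z \end{bmatrix} 

(c) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix} 

(d) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix} 


9. Consider the following functions T : R^3 ↦ R^2. Explain why each of these functions T 
is not linear. 

(a) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix} 

(b) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix} 

(c) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin x + 2y + 3z \\ 2y + 3x + z \end{bmatrix} 

(d) T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y + 3x - \ln z \end{bmatrix} 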


10. Suppose 

[ A_1 ··· A_n ]^{-1} 

exists where each A_j ∈ R^n and let vectors {B_1, ···, B_n} in R^m be given. Show that 
there always exists a linear transformation T such that T(A_i) = B_i. 


11. Find the matrix for T(w) = proj_v(w) where v = [ 1 -2 3 ]^T. 

12. Find the matrix for T(w) = proj_v(w) where v = [ 1 5 3 ]^T. 

13. Find the matrix for T(w) = proj_v(w) where v = [ 1 0 3 ]^T. 





5.3 Properties of Linear Transformations 


Outcomes 


A. Find the composite of transformations and the inverse of a transformation. 

Let T : R^n ↦ R^m be a linear transformation. Then there are some important properties 
of T which will be examined in this section. Consider the following theorem. 


Theorem 5.12: Properties of Linear Transformations 


Let T : R^n ↦ R^m be a linear transformation and let x ∈ R^n. 

• T(0x) = 0T(x). Hence T(0) = 0. (T preserves the zero vector) 

• T((-1)x) = (-1)T(x). Hence T(-x) = -T(x). (T preserves the negative of a 
vector) 

• Let x_1, ..., x_k ∈ R^n and a_1, ..., a_k ∈ R. Then if y = a_1x_1 + a_2x_2 + ... + a_kx_k, it 
follows that T(y) = T(a_1x_1 + a_2x_2 + ... + a_kx_k) = a_1T(x_1) + a_2T(x_2) + ... + a_kT(x_k). 
(T preserves linear combinations). 


Consider the following example. 


Example 5.13: Linear Combination 


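Let T : R^3 ↦ R^4 be a linear transformation such that 

\[ T\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ 0 \\ -2 \end{bmatrix}, \quad T\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ -1 \\ 5 \end{bmatrix} \]

Find T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix}. 


Solution. Using the third property in Theorem 5.12, we can find T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} by writing 
\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} as a linear combination of \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} and \begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix}. 

Therefore we want to find a, b ∈ R such that 

\[ \begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = a \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + b \begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \]

The necessary augmented matrix and resulting reduced row-echelon form are given by: 

\[ \left[ \begin{array}{rr|r} 1 & 4 & -7 \\ 3 & 0 & 3 \\ 1 & 5 & -9 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr|r} 1 & 0 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{array} \right] \]

Hence a = 1, b = -2 and 

\[ \begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + (-2) \begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \]

Now, using the third property above, we have 

\[ T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = T\left( 1 \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + (-2) \begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \right) = 1\, T\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} - 2\, T\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ 0 \\ -2 \end{bmatrix} - 2 \begin{bmatrix} 4 \\ 5 \\ -1 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ -6 \\ 2 \\ -12 \end{bmatrix} \]

Therefore, T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = \begin{bmatrix} -4 \\ -6 \\ 2 \\ -12 \end{bmatrix}. 

□ 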
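
The arithmetic of Example 5.13 can be replayed as a short check. The sketch below takes the coefficients a = 1 and b = -2 from the reduced row-echelon form in the solution rather than solving the system itself; the helper name lin_comb is an assumption of this sketch.

```python
def lin_comb(a, u, b, v):
    """Return the linear combination a*u + b*v of two vectors."""
    return [a * x + b * y for x, y in zip(u, v)]


v1, v2 = [1, 3, 1], [4, 0, 5]             # vectors whose images are known
Tv1, Tv2 = [4, 4, 0, -2], [4, 5, -1, 5]   # T(v1) and T(v2) from the example
a, b = 1, -2                              # coefficients found in the solution

# The target vector really is a*v1 + b*v2 ...
print(lin_comb(a, v1, b, v2))    # [-7, 3, -9]
# ... so by linearity its image is a*T(v1) + b*T(v2).
print(lin_comb(a, Tv1, b, Tv2))  # [-4, -6, 2, -12]
```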


Suppose two linear transformations act on the same vector x, first the transformation T 
and then a second transformation given by S. We can find the composite transformation 
that results from applying both transformations. 

Definition 5.14: Composition of Linear Transformations 


Let T : R^k ↦ R^n and S : R^n ↦ R^m be linear transformations. Then the composite 
of S and T is 

S∘T : R^k ↦ R^m 

The action of S∘T is given by 

(S∘T)(x) = S(T(x)) for all x ∈ R^k 


Notice that the resulting vector will be in R^m. Be careful to observe the order of 
transformations. We write S∘T but apply the transformation T first, followed by S. 


Theorem 5.15: Composition of Transformations 


Let T : R^k ↦ R^n and S : R^n ↦ R^m be linear transformations such that T is induced by 
the matrix A and S is induced by the matrix B. Then S∘T is a linear transformation 
which is induced by the matrix BA. 


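Consider the following example. 



Solution. By Theorem 5.15, the matrix of S∘T is given by BA. 

\[ BA = \begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 8 & 4 \\ 2 & 0 \end{bmatrix} \]

To find (S∘T)(x), multiply x by BA as follows 

\[ \begin{bmatrix} 8 & 4 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 24 \\ 2 \end{bmatrix} \]

To check, first determine T(x): 

\[ \begin{bmatrix} 1 & 2 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 9 \\ 2 \end{bmatrix} \]

Then, compute S(T(x)) as follows: 

\[ \begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 9 \\ 2 \end{bmatrix} = \begin{bmatrix} 24 \\ 2 \end{bmatrix} \]

□ 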
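
Theorem 5.15 can be illustrated by computing (S∘T)(x) in two ways, once with the single matrix BA and once by applying A and then B. The matrices and the vector below are the ones appearing in the solution above; the helper names mat_mul and mat_vec are assumptions of this sketch.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2], [2, 0]]   # matrix inducing T
B = [[2, 3], [0, 1]]   # matrix inducing S
x = [1, 4]

BA = mat_mul(B, A)
print(BA)                          # [[8, 4], [2, 0]]
print(mat_vec(BA, x))              # [24, 2]
print(mat_vec(B, mat_vec(A, x)))   # [24, 2], the same vector
```

The order matters: the matrix of S∘T is BA, with the matrix of the transformation applied first written on the right.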

Consider a composite transformation S∘T, and suppose that this transformation acted 
such that (S∘T)(x) = x. That is, the transformation S took the vector T(x) and returned 
it to x. In this case, S and T are inverses of each other. Consider the following definition. 



The following theorem is crucial, as it claims that the above inverse transformations are 
unique. 


Theorem 5.18: Inverse of a Transformation 


Let T : R^n ↦ R^n be a linear transformation induced by the matrix A. Then T has 
an inverse transformation if and only if the matrix A is invertible. In this case, the 
inverse transformation is unique and denoted T^{-1} : R^n ↦ R^n. T^{-1} is induced by the 
matrix A^{-1}. 


Consider the following example. 


Example 5.19: Inverse of a Transformation 


Let T : R^2 ↦ R^2 be a linear transformation induced by the matrix 

\[ A = \begin{bmatrix} 2 & 3 \\ 3 & 4 \end{bmatrix} \]

Show that T^{-1} exists and find the matrix B which it is induced by. 

Solution. Since the matrix A is invertible (its determinant is 2(4) - 3(3) = -1, which is 
nonzero), it follows that the transformation T is invertible. Therefore, T^{-1} exists. 

You can verify that A^{-1} is given by: 

\[ A^{-1} = \begin{bmatrix} -4 & 3 \\ 3 & -2 \end{bmatrix} \]

Therefore the linear transformation T^{-1} is induced by the matrix A^{-1}. □ 
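
For a 2 x 2 matrix the inverse can be checked with the usual determinant formula. The following sketch applies it to the matrix A of Example 5.19 using exact fractions; the function name inverse_2x2 is an assumption of this sketch, and the formula only covers the 2 x 2 case.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via (1/det) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[2, 3], [3, 4]]
A_inv = inverse_2x2(A)
print([[str(entry) for entry in row] for row in A_inv])
# [['-4', '3'], ['3', '-2']]
```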


5.3.1. Exercises 


1. Show that if a function T : R^n → R^m is linear, then it is always the case that 
T(0) = 0. 


2. Let T be a linear transformation induced by the matrix A = \begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix} and S a linear 
transformation induced by B = \begin{bmatrix} 0 & -2 \\ 4 & 2 \end{bmatrix}. Find matrix of S∘T and find (S∘T)(x) 
for x = \begin{bmatrix} 2 \\ -1 \end{bmatrix}. 

3. Let T be a linear transformation and suppose T 


1 

-4 


a linear transformation induced by the matrix B = 


1 2 
-1 3 


2 

. Suppose S is 

o 

. Find ( S o T ) ( x ) for 


x = 


1 

-4 


4. Let T be a linear transformation induced by the matrix A = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix} and S a linear 
transformation induced by B = \begin{bmatrix} -1 & 3 \\ 1 & -2 \end{bmatrix}. Find matrix of S∘T and find (S∘T)(x) 
for x = \begin{bmatrix} 5 \\ 6 \end{bmatrix}. 


5. Let T be a linear transformation induced by the matrix A = \begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix}. Find the matrix 
of T^{-1}. 


6. Let T be a linear transformation induced by the matrix A = \begin{bmatrix} 4 & -3 \\ 2 & -2 \end{bmatrix}. Find the 
matrix of T^{-1}. 


7. Let T be a linear transformation and suppose T 
—4 1 

„ . Find the matrix of T~ 1 . 


1 

2 



5.4 Special Linear Transformations in R^2 


Outcomes 


A. Find the matrix of rotations and reflections in R^2 and determine the action of 
each on a vector in R^2. 


In this section, we will examine some special examples of linear transformations in R^2 
including rotations and reflections. We will use the geometric descriptions of vector addition 
and scalar multiplication discussed earlier to show that a rotation of vectors through an 
angle and reflection of a vector across a line are examples of linear transformations. 

First, consider the rotation of a vector through an angle. Such a rotation would achieve 
something like the following if applied to each vector from (0, 0) to the point in the picture 
corresponding to the person shown standing upright. 



More generally, denote a transformation given by a rotation by T. Why is such a 
transformation linear? Consider the following picture which illustrates a rotation. Let u, v denote 
vectors. 

Let's consider how to obtain T(u + v). Simply, you add T(u) and T(v). Here is why. 
If you add T(u) to T(v) you get the diagonal of the parallelogram determined by T(u) and 
T(v), as this action is our usual vector addition. Now, suppose we first add u and v, and 
then apply the transformation T to u + v. Hence, we find T(u + v). As shown in the diagram, 
this will result in the same vector. In other words, T(u + v) = T(u) + T(v). 

This is because the rotation preserves all angles between the vectors as well as their 
lengths. In particular, it preserves the shape of this parallelogram. Thus both T(u) + T(v) 
and T(u + v) give the same vector. It follows that T distributes across addition of the 
vectors of R^2. 

Similarly, if k is a scalar, it follows that T(ku) = kT(u). Thus rotations are an example 
of a linear transformation by Definition 5.2. 

The following theorem gives the matrix of a linear transformation which rotates all vectors 
through an angle of θ. 


Theorem 5.20: Rotation 

Let R_θ : R^2 → R^2 be a linear transformation given by rotating vectors through an 
angle of θ. Then the matrix A of R_θ is given by 

\[ \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \]


Proof. Let e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} and e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. These identify the geometric vectors which point 
along the positive x axis and positive y axis as shown. 

From Theorem 5.5, we need to find R_θ(e_1) and R_θ(e_2), and use these as the columns of 
the matrix A of R_θ. We can use cos, sin of the angle θ to find the coordinates of R_θ(e_1) as 
shown in the above picture. The coordinates of R_θ(e_2) also follow from trigonometry. Thus 

\[ R_\theta(e_1) = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \quad R_\theta(e_2) = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \]

Therefore, from Theorem 5.5, 

\[ A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

We can also prove this algebraically without the use of the above picture. The definition 
of (cos(θ), sin(θ)) is as the coordinates of the point of R_θ(e_1). Now the point of the vector e_2 
is exactly π/2 further along the unit circle from the point of e_1, and therefore after rotation 
through an angle of θ the coordinates x and y of the point of R_θ(e_2) are given by 

\[ (x, y) = (\cos(\theta + \pi/2), \sin(\theta + \pi/2)) = (-\sin\theta, \cos\theta) \]

□ 


Consider the following example. 


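Example 5.21: Rotation in R^2 

Let R_{π/2} : R^2 → R^2 denote rotation through π/2. Find the matrix of R_{π/2}. Then, find 
R_{π/2}(x) where x = \begin{bmatrix} 1 \\ -2 \end{bmatrix}. 


Solution. By Theorem 5.20, the matrix of R_{π/2} is given by 

\[ \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} = \begin{bmatrix} \cos(\pi/2) & -\sin(\pi/2) \\ \sin(\pi/2) & \cos(\pi/2) \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]

To find R_{π/2}(x), we multiply the matrix of R_{π/2} by x as follows 

\[ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

□ 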
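
The rotation matrix of Theorem 5.20 is easy to build numerically. The sketch below repeats Example 5.21 in floating point, so the results are rounded before printing; the function names rotation_matrix and mat_vec are assumptions of this sketch.

```python
import math


def rotation_matrix(theta):
    """Matrix of the rotation of R^2 through the angle theta (in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


R = rotation_matrix(math.pi / 2)
y = mat_vec(R, [1, -2])

# cos(pi/2) is only approximately zero in floating point, hence the rounding.
print([round(value, 12) for value in y])  # [2.0, 1.0]
```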

We now look at an example of a linear transformation involving two angles. 


Example 5.22: The Rotation Matrix of the Sum of Two Angles 


Find the matrix of the linear transformation which is obtained by first rotating all 
vectors through an angle of φ and then through an angle θ. Hence the linear 
transformation rotates all vectors through an angle of θ + φ. 

Solution. Let R_{θ+φ} denote the linear transformation which rotates every vector through an 
angle of θ + φ. Then to obtain R_{θ+φ}, we first apply R_φ and then R_θ where R_φ is the linear 
transformation which rotates through an angle of φ and R_θ is the linear transformation which 
rotates through an angle of θ. Denoting the corresponding matrices by A_{θ+φ}, A_φ, and A_θ, it 
follows that for every u 

\[ R_{\theta+\phi}(u) = A_{\theta+\phi} u = A_\theta A_\phi u = R_\theta R_\phi(u) \]

Notice the order of the matrices here! 
Consequently, you must have 

\[ A_{\theta+\phi} = \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} = A_\theta A_\phi \]

The usual matrix multiplication yields 

\[ A_{\theta+\phi} = \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix} = \begin{bmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & -\cos\theta\sin\phi - \sin\theta\cos\phi \\ \sin\theta\cos\phi + \cos\theta\sin\phi & \cos\theta\cos\phi - \sin\theta\sin\phi \end{bmatrix} = A_\theta A_\phi \]

Don't these look familiar? They are the usual trigonometric identities for the sum of two 
angles derived here using linear algebra concepts. 

□ 
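
The identity between the rotation through θ + φ and the product of the two separate rotation matrices can be spot-checked numerically. The angles below are arbitrary values picked for this sketch, and the helper names rotation_matrix and mat_mul are assumptions; entries are compared with a small tolerance because of floating-point rounding.

```python
import math


def rotation_matrix(angle):
    """Matrix of the rotation of R^2 through the given angle (in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s],
            [s, c]]


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


theta, phi = 0.7, 0.4   # arbitrary test angles

direct = rotation_matrix(theta + phi)
product = mat_mul(rotation_matrix(theta), rotation_matrix(phi))

same = all(math.isclose(direct[i][j], product[i][j], abs_tol=1e-12)
           for i in range(2) for j in range(2))
print(same)  # True
```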


Here we have focused on rotations in two dimensions. However, you can consider rota- 
tions and other geometric concepts in any number of dimensions. This is one of the major 
advantages of linear algebra. You can break down a difficult geometrical procedure into small 
steps, each corresponding to multiplication by an appropriate matrix. Then by multiplying 
the matrices, you can obtain a single matrix which can give you numerical information on 
the results of applying the given sequence of simple procedures. 

Linear transformations which reflect vectors across a line are a second important type of 
transformations in R^2. Consider the following theorem. 


Theorem 5.23: Reflection 

Let Q_m : R^2 → R^2 be a linear transformation given by reflecting vectors over the line 
y = mx. Then the matrix of Q_m is given by 

\[ \frac{1}{1 + m^2} \begin{bmatrix} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{bmatrix} \]

Consider the following example. 

Example 5.24: Reflection in R^2 

Let Q_2 : R^2 → R^2 denote reflection over the line y = 2x. Then Q_2 is a linear 
transformation. Find the matrix of Q_2. Then, find Q_2(x) where x = \begin{bmatrix} 1 \\ -2 \end{bmatrix}. 


Solution. By Theorem 5.23, the matrix of Q_2 is given by 

\[ \frac{1}{1 + m^2} \begin{bmatrix} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{bmatrix} = \frac{1}{1 + (2)^2} \begin{bmatrix} 1 - (2)^2 & 2(2) \\ 2(2) & (2)^2 - 1 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix} \]

To find Q_2(x) we multiply x by the matrix of Q_2 as follows: 

\[ \frac{1}{5} \begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -\frac{11}{5} \\ -\frac{2}{5} \end{bmatrix} \]

□ 
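
The formula of Theorem 5.23 can be evaluated exactly with fractions. The sketch below builds the reflection matrix for a given slope m, applies it for m = 2 to the vector [ 1 -2 ]^T, and also checks that reflecting twice returns the original vector. The function names reflection_matrix and mat_vec are assumptions of this sketch.

```python
from fractions import Fraction


def reflection_matrix(m):
    """Matrix of the reflection of R^2 over the line y = m x (Theorem 5.23)."""
    m = Fraction(m)
    scale = 1 + m * m
    return [[(1 - m * m) / scale, 2 * m / scale],
            [2 * m / scale, (m * m - 1) / scale]]


def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


Q = reflection_matrix(2)
x = [1, -2]
image = mat_vec(Q, x)

print([str(entry) for row in Q for entry in row])  # ['-3/5', '4/5', '4/5', '3/5']
print([str(value) for value in image])             # ['-11/5', '-2/5']
print(mat_vec(Q, image) == x)                      # True: reflecting twice undoes it
```

With m = 0 the same function returns the matrix with rows (1, 0) and (0, -1), the reflection through the x axis.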

Consider the following example which incorporates a reflection as well as a rotation of 
vectors. 


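Example 5.25: Rotation Followed by a Reflection

Find the matrix of the linear transformation which is obtained by first rotating all vectors through an angle of $\pi/6$ and then reflecting through the $x$ axis.

Solution. By Theorem 5.20, the matrix of the transformation which involves rotating through an angle of $\pi/6$ is

$$\begin{bmatrix} \cos(\pi/6) & -\sin(\pi/6) \\ \sin(\pi/6) & \cos(\pi/6) \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix}$$

Reflecting across the $x$ axis is the same action as reflecting vectors over the line $y = mx$ with $m = 0$. By Theorem 5.23, the matrix for the transformation which reflects all vectors through the $x$ axis is

$$\frac{1}{1+m^2}\begin{bmatrix} 1-m^2 & 2m \\ 2m & m^2-1 \end{bmatrix} = \frac{1}{1+(0)^2}\begin{bmatrix} 1-(0)^2 & 2(0) \\ 2(0) & (0)^2-1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

Therefore, the matrix of the linear transformation which first rotates through $\pi/6$ and then reflects through the $x$ axis is given by

$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix}$$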


□ 
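The order of the factors matters here: the transformation applied first appears on the right of the product. The sketch below, in Python with floating point arithmetic, compares the two possible orders; the helper names `rotation` and `matmul` are assumptions made for illustration.

```python
import math

def rotation(theta):
    """Matrix which rotates vectors in the plane through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Product AB of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

reflect_x = [[1.0, 0.0], [0.0, -1.0]]   # reflection through the x axis

# Rotate first, then reflect: the rotation is the right-hand factor.
rotate_then_reflect = matmul(reflect_x, rotation(math.pi / 6))
# Reflect first, then rotate: the factors are swapped.
reflect_then_rotate = matmul(rotation(math.pi / 6), reflect_x)

print(rotate_then_reflect)
print(reflect_then_rotate)
```

The two products differ in their off-diagonal entries, which is a small numerical reminder that matrix multiplication is not commutative.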
5.4.1. Exercises


1. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$.

2. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$.

3. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $-\pi/3$.

4. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$.

5. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/12$. Hint: Note that $\pi/12 = \pi/3 - \pi/4$.

6. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$ and then reflects across the $x$ axis.

7. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$ and then reflects across the $x$ axis.

8. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$ and then reflects across the $x$ axis.

9. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/6$ and then reflects across the $x$ axis followed by a reflection across the $y$ axis.

10. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/4$.

11. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/4$.

12. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/6$.

13. Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/6$.

14. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $5\pi/12$. Hint: Note that $5\pi/12 = 2\pi/3 - \pi/4$.

15. Find the matrix of the linear transformation which rotates every vector in $\mathbb{R}^3$ counterclockwise about the $z$ axis when viewed from the positive $z$ axis through an angle of $30°$ and then reflects through the $xy$ plane.
16. Let $u = \begin{bmatrix} a \\ b \end{bmatrix}$ be a unit vector in $\mathbb{R}^2$. Find the matrix which reflects all vectors across this vector, as shown in the following picture.

Hint: Notice that $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$ for some $\theta$. First rotate through $-\theta$. Next reflect through the $x$ axis. Finally rotate through $\theta$.


5.5 Linear Transformations which are One To One or Onto


Outcomes 


A. Determine if a linear transformation is onto or one to one. 


Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. We define the range or image of $T$ as the set of vectors of $\mathbb{R}^m$ which are of the form $T(x)$ (equivalently, $Ax$) for some $x \in \mathbb{R}^n$. It is common to write $T\mathbb{R}^n$, $T(\mathbb{R}^n)$, or $\mathrm{Im}(T)$ to denote these vectors.



Proof. This follows from the definition of matrix multiplication in Definition 2.13. □ 

This section is devoted to studying two important characterizations of linear transforma- 
tions, called one to one and onto. We define them now. 
Definition 5.27: One to One


Suppose $x_1$ and $x_2$ are vectors in $\mathbb{R}^n$. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is called one to one (often written as $1-1$) if whenever $x_1 \neq x_2$ it follows that:

$$T(x_1) \neq T(x_2)$$

Equivalently, if $T(x_1) = T(x_2)$, then $x_1 = x_2$. Thus, $T$ is $1-1$ if it never takes two different vectors to the same vector.


The second important characterization is called onto. 



We often call a linear transformation which is one-to-one an injection. Similarly, a linear 
transformation which is onto is often called a surjection. 

The following proposition is an important result. 


Proposition 5.29: One to One


Let $T_A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation induced by the $m \times n$ matrix $A$. Then $T_A$ is one to one if and only if $T_A(x) = 0$ implies $x = 0$.


Proof. First note that we can rewrite the statement "$T_A(x) = 0$ implies $x = 0$" in terms of the matrix $A$ as "$Ax = 0$ implies $x = 0$". Therefore we can prove this proposition using $A$.

We need to prove two things here. First, we will prove that if $A$ is one to one, then $Ax = 0$ implies that $x = 0$. Second, we will show that if $Ax = 0$ implies that $x = 0$, then it follows that $A$ is one to one.

First note that $A0 = A(0 + 0) = A0 + A0$ and so $A0 = 0$.

Now suppose $A$ is one to one and $Ax = 0$. We need to show that this implies $x = 0$. Since $A$ is one to one, by Definition 5.27 $A$ can only map one vector to the zero vector $0$. Now $Ax = 0$ and $A0 = 0$, so it follows that $x = 0$. Thus if $A$ is one to one and $Ax = 0$, then $x = 0$.

Next assume that $Ax = 0$ implies $x = 0$. We need to show that $A$ is one to one. Suppose $Ax = Ay$. Then $Ax - Ay = 0$. Hence $Ax - Ay = A(x - y) = 0$. However, we have assumed that $Ax = 0$ implies $x = 0$. This means that whenever $A$ times a vector equals $0$, that vector is also equal to $0$. Therefore, $x - y = 0$ and so $x = y$. Thus $A$ is one to one by Definition 5.27. □

Note that this proposition says that if $A = \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}$ then $A$ is one to one if and only if whenever

$$0 = \sum_{k=1}^{n} c_k A_k$$

it follows that each scalar $c_k = 0$.

We will now take a look at an example of a one to one and onto linear transformation. 


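Example 5.30: A One to One and Onto Linear Transformation


Suppose

$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Then, $T : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear transformation. Is $T$ onto? Is it one to one?


Solution. Recall that because $T$ can be expressed as matrix multiplication, we know that $T$ is a linear transformation. We will start by looking at onto. So suppose $\begin{bmatrix} a \\ b \end{bmatrix} \in \mathbb{R}^2$. Does there exist $\begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2$ such that $T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$? If so, then since $\begin{bmatrix} a \\ b \end{bmatrix}$ is an arbitrary vector in $\mathbb{R}^2$, it will follow that $T$ is onto.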

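This question is familiar to you. It is asking whether there is a solution to the equation

$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$$

This is the same thing as asking for a solution to the following system of equations.

$$\begin{aligned} x + y &= a \\ x + 2y &= b \end{aligned}$$

Set up the augmented matrix and row reduce.

$$\left[\begin{array}{cc|c} 1 & 1 & a \\ 1 & 2 & b \end{array}\right] \to \left[\begin{array}{cc|c} 1 & 0 & 2a - b \\ 0 & 1 & b - a \end{array}\right] \tag{5.4}$$

You can see from this point that the system has a solution. Therefore, we have shown that for any $a, b$, there is a $\begin{bmatrix} x \\ y \end{bmatrix}$ such that $T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$. Thus $T$ is onto.

Now we want to know if $T$ is one to one. By Proposition 5.29 it is enough to show that $Ax = 0$ implies $x = 0$. Consider the system $Ax = 0$ given by:

$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This is the same as the system given by

$$\begin{aligned} x + y &= 0 \\ x + 2y &= 0 \end{aligned}$$

We need to show that the solution to this system is $x = 0$ and $y = 0$. By setting up the augmented matrix and row reducing, we end up with

$$\left[\begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right]$$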
This tells us that $x = 0$ and $y = 0$. Returning to the original system, this says that if

$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

then

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

In other words, $Ax = 0$ implies that $x = 0$. By Proposition 5.29, $A$ is one to one, and so $T$ is also one to one.

We also could have seen that $T$ is one to one from our above solution for onto. By looking at the matrix given by 5.4, you can see that there is a unique solution given by $x = 2a - b$ and $y = b - a$. Therefore, there is only one vector, specifically $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2a - b \\ b - a \end{bmatrix}$, such that $T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$. Hence by Definition 5.27, $T$ is one to one.


□ 


Consider the following important definition. 



The above Example 5.30 demonstrated that the given transformation T is both one to 
one and onto. We can now say that this transformation is an isomorphism. 
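The conclusions of Example 5.30 can be tested numerically. The following Python sketch encodes $T$ together with the solution $x = 2a - b$, $y = b - a$ read off from (5.4); the function names `T` and `preimage` are illustrative only.

```python
def T(x, y):
    """The transformation of Example 5.30: multiply (x, y) by [[1, 1], [1, 2]]."""
    return (x + y, x + 2 * y)

def preimage(a, b):
    """The unique (x, y) with T(x, y) = (a, b), taken from the row reduction (5.4)."""
    return (2 * a - b, b - a)

# Onto: every target (a, b) in a small grid is reached by some input.
targets = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
onto_check = all(T(*preimage(a, b)) == (a, b) for a, b in targets)

# One to one: only the zero vector is sent to the zero vector.
kernel_check = preimage(0, 0) == (0, 0)

print(onto_check, kernel_check)
```

A finite grid of test points is of course not a proof. It only illustrates that the formulas obtained by row reduction behave as the example claims.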


5.5.1. Exercises 


1. Let $T$ be a linear transformation given by

$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Is $T$ one to one? Is $T$ onto?


2. Let $T$ be a linear transformation given by

$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 & 2 \\ 2 & 1 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Is $T$ one to one? Is $T$ onto?


3. Let $T$ be a linear transformation given by

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 & 0 & 1 \\ 1 & 2 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

Is $T$ one to one? Is $T$ onto?


4. Let $T$ be a linear transformation given by

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 3 & -5 \\ 2 & 0 & 2 \\ 2 & 4 & -6 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

Is $T$ one to one? Is $T$ onto?


5. Give an example of a $3 \times 2$ matrix with the property that the linear transformation determined by this matrix is one to one but not onto.


6. Suppose $A$ is an $m \times n$ matrix in which $m < n$. Suppose also that the rank of $A$ equals $m$. Show that the transformation $T$ determined by $A$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$. Hint: The vectors $e_1, \cdots, e_m$ occur as columns in the reduced row-echelon form for $A$.


7. Suppose $A$ is an $m \times n$ matrix in which $m > n$. Suppose also that the rank of $A$ equals $n$. Show that $A$ is one to one. Hint: If not, there exists a nonzero vector $x$ such that $Ax = 0$, and this implies at least one column of $A$ is a linear combination of the others. Show this would require the rank to be less than $n$.

8. Explain why an $n \times n$ matrix $A$ is both one to one and onto if and only if its rank is $n$.


5.6 The General Solution of a Linear System 


Outcomes 


A. Use linear transformations to determine the particular solution and general so- 
lution to a system of equations. 

B. Find the kernel of a linear transformation. 


Recall the definition of a linear transformation discussed above. $T$ is a linear transformation if whenever $x, y$ are vectors and $k, p$ are scalars,

$$T(kx + py) = kT(x) + pT(y)$$

Thus linear transformations distribute across addition and pass scalars to the outside.

It turns out that we can use linear transformations to solve linear systems of equations. Indeed given a system of linear equations of the form $Ax = b$, one may rephrase this as $T(x) = b$ where $T$ is the linear transformation $T_A$ induced by the coefficient matrix $A$. With this in mind consider the following definition.



Recall that a system is called homogeneous if every equation in the system is equal to 0. 
Suppose we represent a homogeneous system of equations by T (x) = 0. It turns out that 
the x for which T (x) = 0 are part of a special set called the null space of T. We may also 
refer to the null space as the kernel of T , and we write ker (T). 

Consider the following definition. 



We may also refer to the kernel of T as the solution space of the equation T (x) = 0. 
Consider the following example. 


Example 5.34: The Kernel of the Derivative


Let $\frac{d}{dx}$ denote the linear transformation defined on the set of functions which are defined on $\mathbb{R}$ and have a continuous derivative. Find $\ker\left(\frac{d}{dx}\right)$.


Solution. The example asks for functions $f$ which have the property that $\frac{df}{dx} = 0$. As you may know from calculus, these functions are the constant functions. Thus $\ker\left(\frac{d}{dx}\right)$ is the set of constant functions. □

Definition 5.33 states that ker (T) is the set of solutions to the equation, 


T (x) = 0 


Since we can write T (x) as Ax, you have been solving such equations for quite some time. 
We have spent a lot of time finding solutions to systems of equations in general, as well
as homogeneous systems. Suppose we look at a system given by Ax = b, and consider the 
related homogeneous system. By this, we mean that we replace b by 0 and look at Ax = 0. 
It turns out that there is a very important relationship between the solutions of the original 
system and the solutions of the associated homogeneous system. In the following theorem, 
we use linear transformations to denote a system of equations. Remember that T (x) = Ax. 



Proof. Consider $y - x_p = y + (-1)x_p$. Then $T(y - x_p) = T(y) - T(x_p)$. Since $y$ and $x_p$ are both solutions to the system, it follows that $T(y) = b$ and $T(x_p) = b$.

Hence, $T(y) - T(x_p) = b - b = 0$. Let $x_0 = y - x_p$. Then, $T(x_0) = 0$ so $x_0$ is a solution to the associated homogeneous system and so is in $\ker(T)$. □

Sometimes people remember the above theorem in the following form. The solutions to the system $T(x) = b$ are given by $x_p + \ker(T)$ where $x_p$ is a particular solution to $T(x) = b$.

For now, we have been speaking about the kernel or null space of a linear transformation 
T. However, we know that every linear transformation T is determined by some matrix 
A. Therefore, we can also speak about the null space of a matrix. Consider the following 
example. 


Example 5.36: The Null Space of a Matrix

Let

$$A = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix}$$

Find $\ker(A)$. Equivalently, find the solutions to the system of equations $Ax = 0$.


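Solution. We are asked to find $\{x : Ax = 0\}$. In other words we want to solve the system, $Ax = 0$. Let $x = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$. Then this amounts to solving

$$\begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

This is the linear system

$$\begin{aligned} x + 2y + 3z &= 0 \\ 2x + y + z + 2w &= 0 \\ 4x + 5y + 7z + 2w &= 0 \end{aligned}$$

To solve, set up the augmented matrix and row reduce to find the reduced row-echelon form.

$$\left[\begin{array}{cccc|c} 1 & 2 & 3 & 0 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 4 & 5 & 7 & 2 & 0 \end{array}\right] \to \cdots \to \left[\begin{array}{cccc|c} 1 & 0 & -\frac{1}{3} & \frac{4}{3} & 0 \\ 0 & 1 & \frac{5}{3} & -\frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

This yields $x = \frac{1}{3}z - \frac{4}{3}w$ and $y = \frac{2}{3}w - \frac{5}{3}z$. Since $\ker(A)$ consists of the solutions to this system, it consists of vectors of the form,

$$\begin{bmatrix} \frac{1}{3}z - \frac{4}{3}w \\ \frac{2}{3}w - \frac{5}{3}z \\ z \\ w \end{bmatrix} = z\begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w\begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix}$$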

Consider the following example. 


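Example 5.37: A General Solution


The general solution of a linear system of equations is the set of all possible solutions. Find the general solution to the linear system,

$$\begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 9 \\ 7 \\ 25 \end{bmatrix}$$

given that

$$\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix}$$

is one solution.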
Solution. Note the matrix of this system is the same as the matrix in Example 5.36. Therefore, from Theorem 5.35, you will obtain all solutions to the above linear system by adding a particular solution $x_p$ to the solutions of the associated homogeneous system, $x$. One particular solution is given above by

$$x_p = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \tag{5.5}$$

Using this particular solution along with the solutions found in Example 5.36, we obtain the following solutions,

$$z\begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w\begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix}$$

Hence, any solution to the above linear system is of this form. □
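The form "particular solution plus kernel" is easy to test: every choice of the parameters should give a vector which the matrix sends to the right-hand side. The sketch below does this in Python with exact fractions; the names `matvec` and `general_solution` are illustrative.

```python
from fractions import Fraction as F

A = [[1, 2, 3, 0], [2, 1, 1, 2], [4, 5, 7, 2]]
b = [9, 7, 25]

x_p = [1, 1, 2, 1]                  # the particular solution (5.5)
k1 = [F(1, 3), F(-5, 3), 1, 0]      # kernel vector multiplying z
k2 = [F(-4, 3), F(2, 3), 0, 1]      # kernel vector multiplying w

def matvec(M, x):
    """Multiply the matrix M by the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in M]

def general_solution(z, w):
    """The solution x_p + z*k1 + w*k2 for given parameter values."""
    return [p + z * u + w * v for p, u, v in zip(x_p, k1, k2)]

# Every parameter choice should reproduce the right-hand side b.
checks = [matvec(A, general_solution(z, w)) == b
          for z in range(-2, 3) for w in range(-2, 3)]
print(all(checks))
```

Taking $z = w = 0$ recovers the particular solution itself, while the kernel vectors alone are sent to the zero vector.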


5.6.1. Exercises 


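1. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 1 \\ 3 & -4 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

2. Using Problem 1 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 1 \\ 3 & -4 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}$$

3. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 0 & -1 & 2 \\ 1 & -2 & 1 \\ 1 & -4 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

4. Using Problem 3 find the general solution to the following linear system.

$$\begin{bmatrix} 0 & -1 & 2 \\ 1 & -2 & 1 \\ 1 & -4 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$$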
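5. Write the solution set of the following system as a linear combination of vectors.

$$\begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 0 \\ 3 & -4 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

6. Using Problem 5 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 0 \\ 3 & -4 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}$$

7. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 1 \\ 1 & -2 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

8. Using Problem 7 find the general solution to the following linear system.

$$\begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 1 \\ 1 & -2 & 5 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$$

9. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & -1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

10. Using Problem 9 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & -1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix}$$

11. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$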
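12. Using Problem 11 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ -3 \\ 0 \end{bmatrix}$$

13. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

14. Using Problem 13 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix}$$

15. Write the solution set of the following system as a linear combination of vectors

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

16. Using Problem 15 find the general solution to the following linear system.

$$\begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ -3 \\ 1 \end{bmatrix}$$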


17. Suppose Ax = b has a solution. Explain why the solution is unique precisely when 
Ax = 0 has only the trivial solution. 
6. Complex Numbers


6.1 Complex Numbers 


Outcomes 


A. Understand the geometric significance of a complex number as a point in the 
plane. 

B. Prove algebraic properties of addition and multiplication of complex numbers, 
and apply these properties. Understand the action of taking the conjugate of a 
complex number. 

C. Understand the absolute value of a complex number and how to find it as well 
as its geometric significance. 


Although very powerful, the real numbers are inadequate to solve equations such as $x^2 + 1 = 0$, and this is where complex numbers come in. We define the number $i$ as the imaginary number such that $i^2 = -1$, and define complex numbers as those of the form $z = a + bi$ where $a$ and $b$ are real numbers. We call this the standard form, or Cartesian form, of the complex number $z$. Then, we refer to $a$ as the real part of $z$, and $b$ as the imaginary part of $z$. It turns out that such numbers not only solve the above equation, but in fact every polynomial of degree at least 1 with complex coefficients has a root among them. This property, called the Fundamental Theorem of Algebra, is sometimes referred to by saying $\mathbb{C}$ is algebraically closed. Gauss is usually credited with giving a proof of this theorem in 1797 but many others worked on it and the first completely correct proof was due to Argand in 1806.

Just as a real number can be considered as a point on the line, a complex number 
z = a + bi can be considered as a point (a, b) in the plane whose x coordinate is a and 
whose y coordinate is b. For example, in the following picture, the point z = 3 + 2i can be 
represented as the point in the plane with coordinates (3, 2) . 

[Figure: the point $z = (3, 2) = 3 + 2i$ plotted in the plane.]

Addition of complex numbers is defined as follows.

$$(a + bi) + (c + di) = (a + c) + (b + d)i$$

This addition obeys all the usual properties as the following theorem indicates. 



The proof of this theorem is left as an exercise for the reader. 

Now, multiplication of complex numbers is defined the way you would expect, recalling that $i^2 = -1$.

$$\begin{aligned} (a + bi)(c + di) &= ac + adi + bci + i^2 bd \\ &= (ac - bd) + (ad + bc)i \end{aligned}$$

Consider the following examples. 



The following are important properties of multiplication of complex numbers. 
Theorem 6.3: Properties of Multiplication of Complex Numbers


Let $z$, $w$ and $v$ be complex numbers. Then, the following properties of multiplication hold.

• Commutative Law for Multiplication

$zw = wz$

• Associative Law for Multiplication

$(zw)v = z(wv)$

• Multiplicative Identity

$1z = z$

• Existence of Multiplicative Inverse

For each $z \neq 0$, there exists $z^{-1}$ such that $zz^{-1} = 1$

• Distributive Law

$z(w + v) = zw + zv$


You may wish to verify some of these statements. The real numbers also satisfy the above axioms, and in general any mathematical structure which satisfies these axioms is called a field. There are many other fields, in particular even finite ones particularly useful for cryptography, and the reason for specifying these axioms is that linear algebra is all about fields and we can do just about anything in this subject using any field. Although here, the fields of most interest will be the familiar field of real numbers, denoted as $\mathbb{R}$, and the field of complex numbers, denoted as $\mathbb{C}$.

An important construction regarding complex numbers is the complex conjugate denoted by a horizontal line above the number, $\overline{z}$. It is defined as follows.



Geometrically, the action of the conjugate is to reflect a given complex number across the $x$ axis. Algebraically, it changes the sign on the imaginary part of the complex number. Therefore, for a real number $a$, $\overline{a} = a$.

Example 6.5: Conjugate of a Complex Number


• If $z = 3 + 4i$, then $\overline{z} = 3 - 4i$, i.e., $\overline{3 + 4i} = 3 - 4i$.

• $\overline{-2 + 5i} = -2 - 5i$.

• $\overline{i} = -i$.

• $\overline{7} = 7$.

Consider the following computation.

$$\overline{(a + bi)}(a + bi) = (a - bi)(a + bi) = a^2 + b^2 - (ab - ab)i = a^2 + b^2$$

Notice that there is no imaginary part in the product, thus multiplying a complex number by its conjugate results in a real number.



Division of complex numbers is defined as follows. Let $z = a + bi$ and $w = c + di$ be complex numbers such that $c, d$ are not both zero. Then the quotient $z$ divided by $w$ is

$$\begin{aligned} \frac{z}{w} = \frac{a + bi}{c + di} &= \frac{a + bi}{c + di} \times \frac{c - di}{c - di} \\ &= \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2} \\ &= \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}i \end{aligned}$$

In other words, the quotient $\frac{z}{w}$ is obtained by multiplying both top and bottom of $\frac{z}{w}$ by $\overline{w}$ and then simplifying the expression.
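Example 6.7: Division of Complex Numbers


• $\dfrac{1}{i} = \dfrac{1}{i} \times \dfrac{-i}{-i} = \dfrac{-i}{-i^2} = -i$

• $\dfrac{2 - i}{3 + 4i} = \dfrac{2 - i}{3 + 4i} \times \dfrac{3 - 4i}{3 - 4i} = \dfrac{(6 - 4) + (-3 - 8)i}{3^2 + 4^2} = \dfrac{2 - 11i}{25} = \dfrac{2}{25} - \dfrac{11}{25}i$

• $\dfrac{1 - 2i}{-2 + 5i} = \dfrac{1 - 2i}{-2 + 5i} \times \dfrac{-2 - 5i}{-2 - 5i} = \dfrac{(-2 - 10) + (4 - 5)i}{2^2 + 5^2} = -\dfrac{12}{29} - \dfrac{1}{29}i$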
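The division rule can be written as a short routine and compared against the worked quotients. This is a sketch in Python using exact fractions; the function name `divide` and its argument order are assumptions made for illustration.

```python
from fractions import Fraction

def divide(a, b, c, d):
    """Real and imaginary parts of (a + bi)/(c + di).

    Multiplies top and bottom by the conjugate c - di, as in the text.
    """
    denom = Fraction(c * c + d * d)
    if denom == 0:
        raise ZeroDivisionError("c and d must not both be zero")
    real = Fraction(a * c + b * d) / denom
    imag = Fraction(b * c - a * d) / denom
    return (real, imag)

print(divide(1, 0, 0, 1))     # 1 / i
print(divide(2, -1, 3, 4))    # (2 - i) / (3 + 4i)
print(divide(1, -2, -2, 5))   # (1 - 2i) / (-2 + 5i)
```

Python's built-in `complex` type performs the same division in floating point, so `complex(2, -1) / complex(3, 4)` offers an independent check of the second quotient.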


Interestingly every nonzero complex number $a + bi$ has a unique multiplicative inverse. In other words, for a nonzero complex number $z$, there exists a number $z^{-1}$ (or $\frac{1}{z}$) so that $zz^{-1} = 1$. Note that $z = a + bi$ is nonzero exactly when $a^2 + b^2 \neq 0$, and its inverse can be written in standard form as defined now.



Note that we may write $z^{-1}$ as $\frac{1}{z}$. Both notations represent the multiplicative inverse of the complex number $z$. Consider now an example.

Another important construction of complex numbers is that of the absolute value, also called the modulus. Consider the following definition.


Definition 6.10: Absolute Value

The absolute value, or modulus, of a complex number, denoted $|z|$, is defined as follows.

$$|a + bi| = \sqrt{a^2 + b^2}$$


Thus, if z is the complex number z = a + bi, it follows that 



Also from the definition, if $z = a + bi$ and $w = c + di$ are two complex numbers, then $|zw| = |z||w|$. Take a moment to verify this.

The triangle inequality is an important property of the absolute value of complex num- 
bers. There are two useful versions which we present here, although the first one is officially 
called the triangle inequality. 



Proof. Let $z = a + bi$ and $w = c + di$. First note that

$$z\overline{w} = (a + bi)(c - di) = ac + bd + (bc - ad)i$$

and so $|ac + bd| \leq |z\overline{w}| = |z||w|$.

Then,

$$\begin{aligned} |z + w|^2 &= (a + c + i(b + d))(a + c - i(b + d)) \\ &= (a + c)^2 + (b + d)^2 = a^2 + c^2 + 2ac + 2bd + b^2 + d^2 \\ &\leq |z|^2 + |w|^2 + 2|z||w| = (|z| + |w|)^2 \end{aligned}$$

Taking the square root, we have that

$$|z + w| \leq |z| + |w|$$

so this verifies the triangle inequality.

To get the second inequality, write

$$z = z - w + w, \quad w = w - z + z$$

and so by the first form of the inequality we get both:

$$|z| \leq |z - w| + |w|, \quad |w| \leq |z - w| + |z|$$

Hence, both $|z| - |w|$ and $|w| - |z|$ are no larger than $|z - w|$. This proves the second version because $||z| - |w||$ is one of $|z| - |w|$ or $|w| - |z|$. □

With this definition, it is important to note the following. You may wish to take the time 
to verify this remark. 

Let $z = a + bi$ and $w = c + di$. Then $|z - w| = \sqrt{(a - c)^2 + (b - d)^2}$. Thus the distance between the point in the plane determined by the ordered pair $(a, b)$ and the ordered pair $(c, d)$ equals $|z - w|$ where $z$ and $w$ are as just described.

For example, consider the distance between $(2, 5)$ and $(1, 8)$. Letting $z = 2 + 5i$ and $w = 1 + 8i$, $z - w = 1 - 3i$, $(z - w)\overline{(z - w)} = (1 - 3i)(1 + 3i) = 10$ so $|z - w| = \sqrt{10}$.
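The distance computation above can be repeated with Python's built-in `complex` type, whose `abs` returns the modulus. The helper name `distance` is an illustrative choice.

```python
import math

def distance(p, q):
    """Distance between the points p and q of the plane, computed as |z - w|."""
    z = complex(p[0], p[1])
    w = complex(q[0], q[1])
    return abs(z - w)

d = distance((2, 5), (1, 8))
print(d, math.sqrt(10))

# The triangle inequality for the same two numbers.
z, w = complex(2, 5), complex(1, 8)
print(abs(z + w) <= abs(z) + abs(w))
```

Because the arithmetic is done in floating point, the comparison with $\sqrt{10}$ is best made with a tolerance, for instance using `math.isclose`.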

Recall that we refer to z = a + bi as the standard form of the complex number. In the 
next section, we examine another form in which we can express the complex number. 


6.1.1. Exercises 

1. Let $z = 2 + 7i$ and let $w = 3 - 8i$. Compute the following.

(a) $z + w$

(b) $z - 2w$

(c) $zw$

(d) w

2. Let $z = 1 - 4i$. Compute the following.

(a) $\overline{z}$

(b) $z^{-1}$

(c) $|z|$

3. Let $z = 3 + 5i$ and $w = 2 - i$. Compute the following.

(a) $zw$

(b) $|zw|$

(c) $z^{-1}w$

4. If $z$ is a complex number, show there exists a complex number $w$ with $|w| = 1$ and $wz = |z|$.

5. If $z, w$ are complex numbers prove $\overline{zw} = \overline{z}\,\overline{w}$ and then show by induction that $\overline{z_1 \cdots z_m} = \overline{z_1} \cdots \overline{z_m}$. Also verify that $\overline{\sum_{k=1}^{m} z_k} = \sum_{k=1}^{m} \overline{z_k}$. In words this says the conjugate of a product equals the product of the conjugates and the conjugate of a sum equals the sum of the conjugates.

6. Suppose $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ where all the $a_k$ are real numbers. Suppose also that $p(z) = 0$ for some $z \in \mathbb{C}$. Show it follows that $p(\overline{z}) = 0$ also.

7. I claim that $1 = -1$. Here is why.

$$-1 = i^2 = \sqrt{-1}\sqrt{-1} = \sqrt{(-1)^2} = \sqrt{1} = 1$$

This is clearly a remarkable result but is there something wrong with it? If so, what is wrong?

6.2 Polar Form 


Outcomes 


A. Convert a complex number from standard form to polar form, and from polar 
form to standard form. 


In the previous section, we identified a complex number z = a + bi with a point (a, b ) 
in the coordinate plane. There is another form in which we can express the same number, 
called the polar form. The polar form is the focus of this section. It will turn out to be very 
useful if not crucial for certain calculations as we shall soon see. 

Suppose $z = a + bi$ is a complex number, and let $r = \sqrt{a^2 + b^2} = |z|$. Recall that $r$ is the modulus of $z$. Note first that

$$\left(\frac{a}{r}\right)^2 + \left(\frac{b}{r}\right)^2 = \frac{a^2 + b^2}{r^2} = 1$$

and so $\left(\frac{a}{r}, \frac{b}{r}\right)$ is a point on the unit circle. Therefore, there exists an angle $\theta$ (in radians) such that

$$\cos\theta = \frac{a}{r}, \quad \sin\theta = \frac{b}{r}$$

In other words $\theta$ is an angle such that $a = r\cos\theta$ and $b = r\sin\theta$, that is $\theta = \cos^{-1}(a/r)$ and $\theta = \sin^{-1}(b/r)$. We call this angle $\theta$ the argument of $z$.

We often speak of the principal argument of $z$. This is the unique angle $\theta \in (-\pi, \pi]$ such that

$$\cos\theta = \frac{a}{r}, \quad \sin\theta = \frac{b}{r}$$

The polar form of the complex number $z = a + bi = r(\cos\theta + i\sin\theta)$ is for convenience written as:

$$z = re^{i\theta}$$

where $\theta$ is the argument of $z$.



When given $z = re^{i\theta}$, the identity $e^{i\theta} = \cos\theta + i\sin\theta$ will convert $z$ back to standard form. Here we think of $e^{i\theta}$ as a short cut for $\cos\theta + i\sin\theta$. This is all we will need in this course, but in reality $e^{i\theta}$ can be considered as the complex equivalent of the exponential function where this turns out to be a true equality.



Thus we can convert any complex number in the standard (Cartesian) form z = a + bi 
into its polar form. Consider the following example. 



Solution. First, find $r$. By the above discussion, $r = \sqrt{a^2 + b^2} = |z|$. Therefore,

$$r = \sqrt{2^2 + 2^2} = \sqrt{8} = 2\sqrt{2}$$

Now, to find $\theta$, we plot the point $(2, 2)$ and find the angle from the positive $x$ axis to the line between this point and the origin. In this case, $\theta = 45° = \frac{\pi}{4}$. That is we found the unique angle $\theta$ such that $\theta = \cos^{-1}(1/\sqrt{2})$ and $\theta = \sin^{-1}(1/\sqrt{2})$.

Note that in polar form, we always express angles in radians, not degrees.

Hence, we can write $z$ as

$$z = 2\sqrt{2}e^{i\frac{\pi}{4}}$$


□ 

Notice that the standard and polar forms are completely equivalent. That is not only 
can we transform a complex number from standard form to its polar form, we can also take 
a complex number in polar form and convert it back to standard form. 



Solution. Let $z = 2e^{2\pi i/3}$ be the polar form of a complex number. Recall that $e^{i\theta} = \cos\theta + i\sin\theta$. Therefore using standard values of $\sin$ and $\cos$ we get:

$$z = 2e^{i2\pi/3} = 2(\cos(2\pi/3) + i\sin(2\pi/3)) = 2\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right) = -1 + \sqrt{3}i$$

which is the standard form of this complex number. □

You can always verify your answer by converting it back to polar form and ensuring you reach the original answer.
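Conversions between standard and polar form can be checked with Python's standard `cmath` module, where `cmath.polar` returns the pair $(r, \theta)$ and `cmath.rect` goes the other way. The wrapper names `to_polar` and `to_standard` below are illustrative only.

```python
import cmath
import math

def to_polar(z):
    """Modulus r and argument theta (in radians) of z."""
    return cmath.polar(z)

def to_standard(r, theta):
    """The complex number r(cos(theta) + i sin(theta)) in standard form."""
    return cmath.rect(r, theta)

r, theta = to_polar(2 + 2j)
print(r, theta)                       # compare with 2*sqrt(2) and pi/4

z_back = to_standard(2, 2 * math.pi / 3)
print(z_back)                         # compare with -1 + sqrt(3) i

# Round trip: converting to polar form and back recovers the number.
print(to_standard(*to_polar(z_back)))
```

The results are floating point approximations, so they agree with the exact values $2\sqrt{2}$, $\pi/4$ and $-1 + \sqrt{3}i$ only up to rounding.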


6.2.1. Exercises 

1. Let $z = 3 + 3i$ be a complex number written in standard form. Convert $z$ to polar form, and write it in the form $z = re^{i\theta}$.

2. Let $z = 2i$ be a complex number written in standard form. Convert $z$ to polar form, and write it in the form $z = re^{i\theta}$.

3. Let $z = 4e^{\frac{2\pi}{3}i}$ be a complex number written in polar form. Convert $z$ to standard form, and write it in the form $z = a + bi$.

4. Let z = —lei* be a complex number written in polar form. Convert z to standard 
form, and write it in the form z = a + bi. 

5. If $z$ and $w$ are two complex numbers and the polar form of $z$ involves the angle $\theta$ while the polar form of $w$ involves the angle $\phi$, show that in the polar form for $zw$ the angle involved is $\theta + \phi$.

6.3 Roots of Complex Numbers


Outcomes 


A. Understand De Moivre’s theorem and be able to use it to find the roots of a 
complex number. 


A fundamental identity is the formula of De Moivre with which we begin this section. 



Proof. The proof is by induction on $n$. It is clear the formula holds if $n = 1$. Suppose it is true for $n$. Then, consider $n + 1$.

$$(r(\cos\theta + i\sin\theta))^{n+1} = (r(\cos\theta + i\sin\theta))^n (r(\cos\theta + i\sin\theta))$$

which by induction equals

$$\begin{aligned} &= r^{n+1}(\cos n\theta + i\sin n\theta)(\cos\theta + i\sin\theta) \\ &= r^{n+1}((\cos n\theta\cos\theta - \sin n\theta\sin\theta) + i(\sin n\theta\cos\theta + \cos n\theta\sin\theta)) \\ &= r^{n+1}(\cos(n + 1)\theta + i\sin(n + 1)\theta) \end{aligned}$$

by the formulas for the cosine and sine of the sum of two angles. □
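The identity established by this induction, $(r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta)$, can be spot-checked numerically. The following Python sketch compares both sides for a few sample values; the function names are illustrative.

```python
import math

def power_by_formula(r, theta, n):
    """Right-hand side: r^n (cos(n theta) + i sin(n theta))."""
    return complex(r ** n * math.cos(n * theta), r ** n * math.sin(n * theta))

def power_directly(r, theta, n):
    """Left-hand side: raise r(cos(theta) + i sin(theta)) to the power n."""
    z = complex(r * math.cos(theta), r * math.sin(theta))
    return z ** n

samples = [(1.0, math.pi / 7, 5), (2.0, 1.3, 4), (0.5, -2.0, 6)]
differences = [abs(power_by_formula(r, t, n) - power_directly(r, t, n))
               for r, t, n in samples]
print(differences)
```

Each difference should be zero up to rounding error. Such a check does not replace the proof, but it is a convenient way to catch a mistaken sign or exponent.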

The process used in the previous proof, called mathematical induction is very powerful 
in Mathematics and Computer Science and explored in more detail in the Appendix. 

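Now, consider a corollary of Theorem 6.15.


Corollary 6.16: Roots of Complex Numbers

Let z be a nonzero complex number. Then there are always exactly k distinct k th roots of z
in $\mathbb{C}$.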



Proof. Let $z = a + bi$ and let $z = |z|(\cos\theta + i\sin\theta)$ be the polar form of the complex number.
By De Moivre’s theorem, a complex number

$$w = re^{i\alpha} = r(\cos\alpha + i\sin\alpha)$$

is a k th root of z if and only if

$$w^k = (re^{i\alpha})^k = r^k e^{ik\alpha} = r^k(\cos k\alpha + i\sin k\alpha) = |z|(\cos\theta + i\sin\theta)$$

This requires $r^k = |z|$ and so $r = |z|^{1/k}$. Also, both $\cos(k\alpha) = \cos\theta$ and $\sin(k\alpha) = \sin\theta$.
This can only happen if

$$k\alpha = \theta + 2\ell\pi$$

for $\ell$ an integer. Thus

$$\alpha = \frac{\theta + 2\ell\pi}{k}, \quad \ell = 0, 1, 2, \cdots, k-1$$

and so the k th roots of z are of the form

$$|z|^{1/k}\left(\cos\left(\frac{\theta + 2\ell\pi}{k}\right) + i\sin\left(\frac{\theta + 2\ell\pi}{k}\right)\right), \quad \ell = 0, 1, 2, \cdots, k-1$$

Since the cosine and sine are periodic of period $2\pi$, there are exactly k distinct numbers
which result from this formula. □

The procedure for finding the k k th roots of $z \in \mathbb{C}$ is as follows.

Procedure 6.17: Finding Roots of a Complex Number


Let w be a complex number. We wish to find the n th roots of w, that is all z such that
$z^n = w$.

There are n distinct n th roots and they can be found as follows:

1. Express both z and w in polar form $z = re^{i\theta}$, $w = se^{i\phi}$. Then $z^n = w$ becomes:

$$(re^{i\theta})^n = r^n e^{in\theta} = se^{i\phi}$$

We need to solve for r and $\theta$.

2. Solve the following two equations:

$$r^n = s$$

$$e^{in\theta} = e^{i\phi} \tag{6.1}$$

3. The solutions to $r^n = s$ are given by $r = \sqrt[n]{s}$.

4. The solutions to $e^{in\theta} = e^{i\phi}$ are given by:

$$n\theta = \phi + 2\pi\ell, \text{ for } \ell = 0, 1, 2, \cdots, n-1$$

or

$$\theta = \frac{\phi}{n} + \frac{2}{n}\pi\ell, \text{ for } \ell = 0, 1, 2, \cdots, n-1$$

5. Using the solutions r, $\theta$ to the equations given in (6.1) construct the n th roots of
the form $z = re^{i\theta}$.
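
The steps of this procedure can be sketched in Python using the standard `cmath` module. This is a minimal illustration rather than a definitive implementation: the function name `nth_roots` is an assumption made for the example, the input is assumed to be nonzero, and the results are floating-point approximations.

```python
import cmath
import math


def nth_roots(w: complex, n: int) -> list[complex]:
    """Return the n distinct n-th roots of a nonzero complex number w."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if w == 0:
        raise ValueError("w must be nonzero to have n distinct roots")
    s, phi = cmath.polar(w)        # step 1: write w = s e^{i phi}
    r = s ** (1.0 / n)             # step 3: r is the real n-th root of s
    roots = []
    for ell in range(n):           # step 4: theta = phi/n + (2/n) pi ell
        theta = (phi + 2 * math.pi * ell) / n
        roots.append(cmath.rect(r, theta))   # step 5: z = r e^{i theta}
    return roots


# Usage: each root z should satisfy z^3 = i up to rounding.
for z in nth_roots(1j, 3):
    print(z, z ** 3)
```

Because the angles are computed in floating point, a root such as $-i$ may appear with a very small nonzero real part.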


Notice that once the roots are obtained in the final step, they can then be converted to 
standard form if necessary. Let’s consider an example of this concept. Note that according 
to Corollary 6.16, there are exactly 3 cube roots of a complex number. 



Example 6.18: Finding Cube Roots

Find the three cube roots of i. In other words, find all z such that $z^3 = i$.


Solution. First, convert each number to polar form: $z = re^{i\theta}$ and $i = 1e^{i\pi/2}$. The equation
now becomes

$$(re^{i\theta})^3 = r^3 e^{3i\theta} = 1e^{i\pi/2}$$

Therefore, the two equations that we need to solve are $r^3 = 1$ and $3i\theta = i\pi/2$. Given that
$r \in \mathbb{R}$ and $r^3 = 1$ it follows that $r = 1$.

Solving the second equation is as follows. First divide by i. Then, since the argument of
i is not unique we write $3\theta = \pi/2 + 2\pi\ell$ for $\ell = 0, 1, 2$.

$$3\theta = \pi/2 + 2\pi\ell \text{ for } \ell = 0, 1, 2$$

$$\theta = \pi/6 + \frac{2}{3}\pi\ell \text{ for } \ell = 0, 1, 2$$

For $\ell = 0$:

$$\theta = \pi/6 + \frac{2}{3}\pi(0) = \pi/6$$

For $\ell = 1$:

$$\theta = \pi/6 + \frac{2}{3}\pi(1) = \frac{5}{6}\pi$$

For $\ell = 2$:

$$\theta = \pi/6 + \frac{2}{3}\pi(2) = \frac{3}{2}\pi$$

Therefore, the three roots are given by

$$1e^{i\pi/6}, \quad 1e^{i\frac{5}{6}\pi}, \quad 1e^{i\frac{3}{2}\pi}$$

Written in standard form, these roots are, respectively,

$$\frac{\sqrt{3}}{2} + i\frac{1}{2}, \quad -\frac{\sqrt{3}}{2} + i\frac{1}{2}, \quad -i$$
The ability to find k th roots can also be used to factor some polynomials. 


Example 6.19: Solving a Polynomial Equation 


Factor the polynomial $x^3 - 27$.


Solution. First find the cube roots of 27. By the above procedure, these cube roots are
$3$, $3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)$, and $3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)$. You may wish to verify this using the above steps.

Therefore,

$$x^3 - 27 = (x - 3)\left(x - 3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)\right)$$

Note also $\left(x - 3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)\right) = x^2 + 3x + 9$ and so

$$x^3 - 27 = (x - 3)(x^2 + 3x + 9)$$

where the quadratic polynomial $x^2 + 3x + 9$ cannot be factored without using complex
numbers. □


Note that even though the polynomial $x^3 - 27$ has all real coefficients, it has some complex
zeros, $3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)$ and $3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)$. These zeros are complex conjugates of each
other. It is always the case that if a polynomial has real coefficients and a complex root, it
will also have a root equal to the complex conjugate.

6.3.1. Exercises


1. Give the complete solution to $x^4 + 16 = 0$.

2. Find the complex cube roots of 8.

3. Find the four fourth roots of 16.

4. De Moivre’s theorem says $[r(\cos t + i\sin t)]^n = r^n(\cos nt + i\sin nt)$ for n a positive
integer. Does this formula continue to hold for all integers n, even negative integers?
Explain.

5. Factor $x^3 + 8$ as a product of linear factors. Hint: Use the result of 2.

6. Write $x^3 + 27$ in the form $(x + 3)(x^2 + ax + b)$ where $x^2 + ax + b$ cannot be factored
any more using only real numbers.

7. Completely factor $x^4 + 16$ as a product of linear factors. Hint: Use the result of 3.

8. Factor $x^4 + 16$ as the product of two quadratic polynomials each of which cannot be
factored further without using complex numbers.

9. If n is an integer, is it always true that $(\cos\theta - i\sin\theta)^n = \cos(n\theta) - i\sin(n\theta)$? Explain.

10. Suppose $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ is a polynomial and it has n zeros,

$$z_1, z_2, \cdots, z_n$$

listed according to multiplicity. (z is a root of multiplicity m if the polynomial $f(x) = (x - z)^m$
divides $p(x)$ but $(x - z)f(x)$ does not.) Show that

$$p(x) = a_n(x - z_1)(x - z_2)\cdots(x - z_n)$$

6.4 The Quadratic Formula 


Outcomes 


A. Use the Quadratic Formula to find the complex roots of a quadratic equation. 


The roots (or solutions) of a quadratic equation $ax^2 + bx + c = 0$ where a, b, c are real
numbers are given by the familiar quadratic formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

When working with real numbers, we cannot evaluate this formula if $b^2 - 4ac < 0$. However,
complex numbers allow us to find square roots of negative numbers, and the quadratic
formula remains valid for finding roots of the corresponding quadratic equation. In this case
there are exactly two distinct (complex) square roots of $b^2 - 4ac$, which are $i\sqrt{4ac - b^2}$ and
$-i\sqrt{4ac - b^2}$.

Here is an example. 



Example 6.20: Solutions to Quadratic Equation

Find the solutions to $x^2 + 2x + 5 = 0$.


Solution. In terms of the quadratic equation above, a = 1, b = 2, and c = 5. Therefore, we
can use the quadratic formula with these values, which becomes

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2 \pm \sqrt{(2)^2 - 4(1)(5)}}{2(1)}$$

Solving this equation, we see that the solutions are given by

$$x = \frac{-2 \pm \sqrt{4 - 20}}{2} = \frac{-2 \pm 4i}{2} = -1 \pm 2i$$

We can verify that these are solutions of the original equation. We will show $x = -1 + 2i$
and leave $x = -1 - 2i$ as an exercise.

$$x^2 + 2x + 5 = (-1 + 2i)^2 + 2(-1 + 2i) + 5 = 1 - 4i - 4 - 2 + 4i + 5 = 0$$

Hence $x = -1 + 2i$ is a solution. □


What if the coefficients of the quadratic equation are actually complex numbers? Does 
the formula hold even in this case? The answer is yes. This is a hint on how to do Problem 4 
below, a special case of the fundamental theorem of algebra, and an ingredient in the proof 
of some versions of this theorem. 

Consider the following example. 



Example 6.21: Solutions to Quadratic Equation

Find the solutions to $x^2 - 2ix - 5 = 0$.


Solution. In terms of the quadratic equation above, a = 1, b = $-2i$, and c = $-5$. Therefore,
we can use the quadratic formula with these values, which becomes

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2i \pm \sqrt{(-2i)^2 - 4(1)(-5)}}{2(1)}$$

Solving this equation, we see that the solutions are given by

$$x = \frac{2i \pm \sqrt{-4 + 20}}{2} = \frac{2i \pm 4}{2} = i \pm 2$$

We can verify that these are solutions of the original equation. We will show $x = i + 2$
and leave $x = i - 2$ as an exercise.

$$x^2 - 2ix - 5 = (i + 2)^2 - 2i(i + 2) - 5 = -1 + 4i + 4 + 2 - 4i - 5 = 0$$

Hence $x = i + 2$ is a solution. □
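
The quadratic formula is straightforward to evaluate with complex arithmetic. The following Python sketch uses the standard `cmath` module; the function name `quadratic_roots` is an assumption for illustration, and the results are floating-point values.

```python
import cmath


def quadratic_roots(a: complex, b: complex, c: complex) -> tuple[complex, complex]:
    """Return both roots of a x^2 + b x + c = 0 for complex a, b, c with a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    # cmath.sqrt returns one complex square root of the discriminant;
    # the other square root is its negative, which the +/- below accounts for.
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)


# Usage with real coefficients and with complex coefficients.
print(quadratic_roots(1, 2, 5))
print(quadratic_roots(1, -2j, -5))
```

The same function handles real and complex coefficients because `cmath.sqrt` accepts negative and complex arguments.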

We conclude this section by stating an essential theorem. 


Theorem 6.22: The Fundamental Theorem of Algebra 


Any polynomial of degree at least 1 with complex coefficients has a root which is a 
complex number. 


6.4.1. Exercises 

1. Show that $1 + i$, $2 + i$ are the only two roots to

$$p(x) = x^2 - (3 + 2i)x + (1 + 3i)$$

Hence complex zeros do not necessarily come in conjugate pairs if the coefficients of 
the equation are not real. 

2. Give the solutions to the following quadratic equations having real coefficients. 

(a) $x^2 - 2x + 2 = 0$

(b) $3x^2 + x + 3 = 0$

(c) $x^2 - 6x + 13 = 0$

(d) $x^2 + 4x + 9 = 0$

(e) $4x^2 + 4x + 5 = 0$

3. Give the solutions to the following quadratic equations having complex coefficients.

(a) $x^2 + 2x + 1 + i = 0$

(b) $4x^2 + 4ix - 5 = 0$

(c) $4x^2 + (4 + 4i)x + 1 + 2i = 0$

(d) $x^2 - 4ix - 5 = 0$

(e) $3x^2 + (1 - i)x + 3i = 0$

4. Prove the fundamental theorem of algebra for quadratic polynomials having coefficients
in $\mathbb{C}$. That is, show that an equation of the form

$ax^2 + bx + c = 0$ where a, b, c are complex numbers, $a \neq 0$ has a complex solution.
Hint: Consider the fact, noted earlier that the expressions given from the quadratic
formula do in fact serve as solutions.

7. Spectral Theory


7.1 Eigenvalues and Eigenvectors of a Matrix 


Outcomes 


A. Describe eigenvalues geometrically and algebraically. 

B. Find eigenvalues and eigenvectors for a square matrix. 

Spectral Theory refers to the study of eigenvalues and eigenvectors of a matrix. It is of 
fundamental importance in many areas and is the subject of our study for this chapter. 

7.1.1. Definition of Eigenvectors and Eigenvalues 


In this section, we will work with the entire set of complex numbers, denoted by $\mathbb{C}$. Recall
that the real numbers, $\mathbb{R}$, are contained in the complex numbers, so the discussions in this
section apply to both real and complex numbers.

To illustrate the idea behind what will be discussed, consider the following example. 


Example 7.1: Eigenvectors and Eigenvalues 


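Let

$$A = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix}$$

Compute the product AX for

$$X = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}, \quad X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

What do you notice about AX in each of these products?

Solution. First, compute AX for

$$X = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}$$

This product is given by

$$AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix} = \begin{bmatrix} -50 \\ -40 \\ 30 \end{bmatrix} = 10 \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}$$

In this case, the product AX resulted in a vector which is equal to 10 times the vector
X. In other words, $AX = 10X$.

Let’s see what happens in the next product. Compute AX for the vector

$$X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

This product is given by

$$AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

In this case, the product AX resulted in a vector equal to 0 times the vector X, $AX = 0X$.
Perhaps this matrix is such that AX results in kX, for every vector X. However, consider

$$\begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -5 \\ 38 \\ -11 \end{bmatrix}$$

In this case, AX did not result in a vector of the form kX for some scalar k. □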
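
The three products in this example can be reproduced with a short Python sketch. It is only an illustration: the helper names `mat_vec` and `scaling_factor` are assumptions, matrices are represented as lists of rows with integer entries, and the vector passed in is assumed to be nonzero.

```python
from fractions import Fraction


def mat_vec(A, X):
    """Multiply the matrix A (a list of rows) by the column vector X."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]


def scaling_factor(A, X):
    """Return k if A X = k X for the nonzero vector X, otherwise None."""
    AX = mat_vec(A, X)
    # Use the first nonzero entry of X to propose the only possible k.
    i = next(idx for idx, x in enumerate(X) if x != 0)
    k = Fraction(AX[i], X[i])
    # X is scaled by A exactly when every entry of AX equals k times X.
    return k if all(ax == k * x for ax, x in zip(AX, X)) else None


A = [[0, 5, -10], [0, 22, 16], [0, -9, -2]]
for X in ([-5, -4, 3], [1, 0, 0], [1, 1, 1]):
    print(X, mat_vec(A, X), scaling_factor(A, X))
```

Exact rational arithmetic is used so that the comparison of AX with kX is not affected by rounding.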

There is something special about the first two products calculated in Example 7.1. Notice
that for each, $AX = kX$ where k is some scalar. When this equation holds for some X and
k, we call the scalar k an eigenvalue of A. We often use the special symbol $\lambda$ instead of k
when referring to eigenvalues. In Example 7.1, the values 10 and 0 are eigenvalues for the
matrix A and we can label these as $\lambda_1 = 10$ and $\lambda_2 = 0$.

When $AX = \lambda X$ for some $X \neq 0$, we call such an X an eigenvector of the matrix A.
The eigenvectors of A are associated to an eigenvalue. Hence, if $\lambda_1$ is an eigenvalue of A
and $AX = \lambda_1 X$, we can label this eigenvector as $X_1$. Note again that in order to be an
eigenvector, X must be nonzero.

There is also a geometric significance to eigenvectors. When you have a nonzero vector
which, when multiplied by a matrix results in another vector which is parallel to the first or
equal to 0, this vector is called an eigenvector of the matrix. This is the meaning when the
vectors are in $\mathbb{R}^n$.

The formal definition of eigenvalues and eigenvectors is as follows.

Definition 7.2: Eigenvalues and Eigenvectors


Let A be an n × n matrix and let $X \in \mathbb{C}^n$ be a nonzero vector for which

$$AX = \lambda X \tag{7.1}$$

for some scalar $\lambda$. Then $\lambda$ is called an eigenvalue of the matrix A and X is called an
eigenvector of A associated with $\lambda$, or a $\lambda$-eigenvector of A.

The set of all eigenvalues of an n × n matrix A is denoted by $\sigma(A)$ and is referred to
as the spectrum of A.


The eigenvectors of a matrix A are those vectors X for which multiplication by A results 
in a vector in the same direction or opposite direction to X. Since the zero vector 0 has no 
direction this would make no sense for the zero vector. As noted above, 0 is never allowed 
to be an eigenvector. 

Let’s look at eigenvectors in more detail. Suppose X satisfies 7.1. Then

$$AX - \lambda X = 0$$

or

$$(A - \lambda I)X = 0$$

for some $X \neq 0$. Equivalently you could write $(\lambda I - A)X = 0$, which is more commonly
used. Hence, when we are looking for eigenvectors, we are looking for nontrivial solutions to
this homogeneous system of equations!

Recall that the solutions to a homogeneous system of equations consist of basic solutions,
and the linear combinations of those basic solutions. In this context, we call the basic
solutions of the equation $(\lambda I - A)X = 0$ basic eigenvectors. It follows that any (nonzero)
linear combination of basic eigenvectors is again an eigenvector.

Suppose the matrix $(\lambda I - A)$ is invertible, so that $(\lambda I - A)^{-1}$ exists. Then the following
equation would be true.

$$\begin{aligned} X &= IX \\ &= \left((\lambda I - A)^{-1}(\lambda I - A)\right)X \\ &= (\lambda I - A)^{-1}\left((\lambda I - A)X\right) \\ &= (\lambda I - A)^{-1}0 \\ &= 0 \end{aligned}$$

This claims that $X = 0$. However, we have required that $X \neq 0$. Therefore $(\lambda I - A)$ cannot
have an inverse!

Recall from Theorem 3.33 that if a matrix is not invertible, then its determinant is equal
to 0. Therefore we can conclude that

$$\det(\lambda I - A) = 0 \tag{7.2}$$

Note that this is equivalent to $\det(A - \lambda I) = 0$.

The expression $\det(xI - A)$ is a polynomial (in the variable x) called the characteristic
polynomial of A, and $\det(xI - A) = 0$ is called the characteristic equation. For this
reason we may also refer to the eigenvalues of A as characteristic values, but the former
is often used for historical reasons.

The following theorem claims that the roots of the characteristic polynomial are the
eigenvalues of A. Thus when 7.2 holds, A has a nonzero eigenvector.



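Theorem 7.3: The Existence of an Eigenvector

Let A be an n × n matrix and suppose $\det(\lambda I - A) = 0$ for some $\lambda \in \mathbb{C}$.
Then $\lambda$ is an eigenvalue of A and thus there exists a nonzero vector $X \in \mathbb{C}^n$ such that
$AX = \lambda X$.


Proof. For A an n × n matrix, the method of Laplace Expansion demonstrates that $\det(\lambda I - A)$
is a polynomial of degree n. As such, the equation 7.2 has a solution $\lambda \in \mathbb{C}$ by the
Fundamental Theorem of Algebra. The fact that $\lambda$ is an eigenvalue follows from Theorem 3.33 and
is left as an exercise. □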


7.1.2. Finding Eigenvectors and Eigenvalues 


Now that eigenvalues and eigenvectors have been defined, we will study how to find them 
for a matrix A. 

First, consider the following definition. 


Definition 7.4: Multiplicity of an Eigenvalue 


Let A be an n × n matrix with characteristic polynomial given by $\det(xI - A)$. Then,
the multiplicity of an eigenvalue $\lambda$ of A is the number of times $\lambda$ occurs as a root of
that characteristic polynomial.


For example, suppose the characteristic polynomial of A is given by $(x - 2)^2$. Solving for
the roots of this polynomial, we set $(x - 2)^2 = 0$ and solve for x. We find that $\lambda = 2$ is a
root that occurs twice. Hence, in this case, $\lambda = 2$ is an eigenvalue of A of multiplicity equal
to 2.

We will now look at how to find the eigenvalues and eigenvectors for a matrix A in detail. 
The steps used are summarized in the following procedure. 





Procedure 7.5: Finding Eigenvalues and Eigenvectors 


Let A be an n × n matrix.

1. First, find the eigenvalues $\lambda$ of A by solving the equation $\det(xI - A) = 0$.

2. For each $\lambda$, find the basic eigenvectors $X \neq 0$ by finding the basic solutions to
$(\lambda I - A)X = 0$.

To verify your work, make sure that $AX = \lambda X$ for each $\lambda$ and associated eigenvector
X.
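
For a 2 × 2 matrix, the two steps of this procedure can be sketched in Python. This is a minimal illustration under stated assumptions, not a general method: the function name `eigen_2x2` is hypothetical, the sketch relies on the fact that for a 2 × 2 matrix with rows (a, b) and (c, d) the characteristic polynomial is $x^2 - (a + d)x + (ad - bc)$, and the exact comparisons it uses are only reliable for small integer matrices.

```python
import cmath


def eigen_2x2(A):
    """Return [(eigenvalue, eigenvector), ...] for a 2 x 2 matrix A given as rows."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    # Step 1: solve det(xI - A) = x^2 - trace*x + det = 0 by the quadratic formula.
    disc = cmath.sqrt(trace * trace - 4 * det)
    pairs = []
    for lam in ((trace + disc) / 2, (trace - disc) / 2):
        # Step 2: pick a nonzero solution of (lam*I - A) X = 0.
        if b != 0 or lam != a:
            X = (b, lam - a)      # solves the first row (lam - a) x - b y = 0
        elif c != 0 or lam != d:
            X = (lam - d, c)      # solves the second row -c x + (lam - d) y = 0
        else:
            X = (1, 0)            # lam*I - A is the zero matrix
        pairs.append((lam, X))
    return pairs


# Usage: verify A X = lam X for each pair, as the procedure recommends.
A = [[-5, 2], [-7, 4]]
for lam, (x, y) in eigen_2x2(A):
    print(lam, (x, y), (A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y))
```

A repeated eigenvalue is returned twice with the same vector, so this sketch does not by itself detect a second independent basic eigenvector.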


We will explore these steps further in the following example. 


Example 7.6: Find the Eigenvalues and Eigenvectors 

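Let $A = \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}$. Find its eigenvalues and eigenvectors.


Solution. We will use Procedure 7.5. First we find the eigenvalues of A by solving the
equation

$$\det(xI - A) = 0$$

This gives

$$\det\left(x \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right) = 0$$

$$\det \begin{bmatrix} x + 5 & -2 \\ 7 & x - 4 \end{bmatrix} = 0$$

Computing the determinant as usual, the result is

$$x^2 + x - 6 = 0$$

Solving this equation, we find that $\lambda_1 = 2$ and $\lambda_2 = -3$.

Now we need to find the basic eigenvectors for each $\lambda$. First we will find the eigenvectors
for $\lambda_1 = 2$. We wish to find all vectors $X \neq 0$ such that $AX = 2X$. These are the solutions
to $(2I - A)X = 0$.

$$\left(2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} 7 & -2 \\ 7 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

The augmented matrix for this system and corresponding reduced row-echelon form are
given by

$$\left[\begin{array}{rr|r} 7 & -2 & 0 \\ 7 & -2 & 0 \end{array}\right] \to \cdots \to \left[\begin{array}{rr|r} 1 & -\frac{2}{7} & 0 \\ 0 & 0 & 0 \end{array}\right]$$

The solution is any vector of the form

$$\begin{bmatrix} \frac{2}{7}s \\ s \end{bmatrix} = s \begin{bmatrix} \frac{2}{7} \\ 1 \end{bmatrix}$$

Multiplying this vector by 7 we obtain a simpler description for the solution to this
system, given by

$$t \begin{bmatrix} 2 \\ 7 \end{bmatrix}$$

This gives the basic eigenvector for $\lambda_1 = 2$ as

$$\begin{bmatrix} 2 \\ 7 \end{bmatrix}$$

To check, we verify that $AX = 2X$ for this basic eigenvector.

$$\begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \end{bmatrix} = \begin{bmatrix} 4 \\ 14 \end{bmatrix} = 2 \begin{bmatrix} 2 \\ 7 \end{bmatrix}$$

This is what we wanted, so we know this basic eigenvector is correct.

Next we will repeat this process to find the basic eigenvector for $\lambda_2 = -3$. We wish to
find all vectors $X \neq 0$ such that $AX = -3X$. These are the solutions to $((-3)I - A)X = 0$.

$$\left((-3) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} 2 & -2 \\ 7 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

The augmented matrix for this system and corresponding reduced row-echelon form are
given by

$$\left[\begin{array}{rr|r} 2 & -2 & 0 \\ 7 & -7 & 0 \end{array}\right] \to \cdots \to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right]$$

The solution is any vector of the form

$$\begin{bmatrix} s \\ s \end{bmatrix} = s \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

This gives the basic eigenvector for $\lambda_2 = -3$ as

$$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

To check, we verify that $AX = -3X$ for this basic eigenvector.

$$\begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -3 \end{bmatrix} = -3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

This is what we wanted, so we know this basic eigenvector is correct. □

The following is an example using Procedure 7.5 for a 3 × 3 matrix.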


Example 7.7: Find the Eigenvalues and Eigenvectors

Find the eigenvalues and eigenvectors for the matrix

$$A = \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}$$

Solution. We will use Procedure 7.5. First we need to find the eigenvalues of A. Recall that
they are the solutions of the equation

$$\det(xI - A) = 0$$

In this case the equation is

$$\det\left(x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right) = 0$$

which becomes

$$\det \begin{bmatrix} x - 5 & 10 & 5 \\ -2 & x - 14 & -2 \\ 4 & 8 & x - 6 \end{bmatrix} = 0$$

Using Laplace Expansion, compute this determinant and simplify. The result is the
following equation.

$$(x - 5)(x^2 - 20x + 100) = 0$$

Solving this equation, we find that the eigenvalues are $\lambda_1 = 5$, $\lambda_2 = 10$ and $\lambda_3 = 10$.
Notice that 10 is a root of multiplicity two due to

$$x^2 - 20x + 100 = (x - 10)^2$$

Therefore, $\lambda_2 = 10$ is an eigenvalue of multiplicity two.

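Now that we have found the eigenvalues for A, we can compute the eigenvectors.

First we will find the basic eigenvectors for $\lambda_1 = 5$. In other words, we want to find all
nonzero vectors X so that $AX = 5X$. This requires that we solve the equation $(5I - A)X = 0$
for X as follows.

$$\left(5 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

That is you need to find the solution to

$$\begin{bmatrix} 0 & 10 & 5 \\ -2 & -9 & -2 \\ 4 & 8 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$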


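By now this is a familiar problem. You set up the augmented matrix and row reduce to
get the solution. Thus the matrix you must row reduce is

$$\left[\begin{array}{rrr|r} 0 & 10 & 5 & 0 \\ -2 & -9 & -2 & 0 \\ 4 & 8 & -1 & 0 \end{array}\right]$$

The reduced row-echelon form is

$$\left[\begin{array}{rrr|r} 1 & 0 & -\frac{5}{4} & 0 \\ 0 & 1 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

and so the solution is any vector of the form

$$\begin{bmatrix} \frac{5}{4}s \\ -\frac{1}{2}s \\ s \end{bmatrix} = s \begin{bmatrix} \frac{5}{4} \\ -\frac{1}{2} \\ 1 \end{bmatrix}$$

where $s \in \mathbb{R}$. If we multiply this vector by 4, we obtain a simpler description for the solution
to this system, as given by

$$t \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \tag{7.3}$$

where $t \in \mathbb{R}$. Here, the basic eigenvector is given by

$$X_1 = \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix}$$

Notice that we cannot let $t = 0$ here, because this would result in the zero vector and
eigenvectors are never equal to 0! Other than this value, every other choice of t in 7.3 results
in an eigenvector.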

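It is a good idea to check your work! To do so, we will take the original matrix and
multiply by the basic eigenvector $X_1$. We check to see if we get $5X_1$.

$$\begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 20 \end{bmatrix} = 5 \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix}$$

This is what we wanted, so we know that our calculations were correct.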

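Next we will find the basic eigenvectors for $\lambda_2, \lambda_3 = 10$. These vectors are the basic
solutions to the equation,

$$\left(10 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

That is you must find the solutions to

$$\begin{bmatrix} 5 & 10 & 5 \\ -2 & -4 & -2 \\ 4 & 8 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$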


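Consider the augmented matrix

$$\left[\begin{array}{rrr|r} 5 & 10 & 5 & 0 \\ -2 & -4 & -2 & 0 \\ 4 & 8 & 4 & 0 \end{array}\right]$$

The reduced row-echelon form for this matrix is

$$\left[\begin{array}{rrr|r} 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

and so the eigenvectors are of the form

$$\begin{bmatrix} -2s - t \\ s \\ t \end{bmatrix} = s \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

Note that you can’t pick t and s both equal to zero because this would result in the zero
vector and eigenvectors are never equal to zero.

Here, there are two basic eigenvectors, given by

$$X_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$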

Taking any (nonzero) linear combination of $X_2$ and $X_3$ will also result in an eigenvector
for the eigenvalue $\lambda = 10$. As in the case for $\lambda = 5$, always check your work! For the
basic eigenvector $X_3$, we can check $AX_3 = 10X_3$ as follows.

$$\begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -10 \\ 0 \\ 10 \end{bmatrix} = 10 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

This is what we wanted. Checking the other basic eigenvector, $X_2$, is left as an exercise. □

It is important to remember that for any eigenvector X, $X \neq 0$. However, it is possible
to have eigenvalues equal to zero. This is illustrated in the following example.

Example 7.8: A Zero Eigenvalue

Let $A = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix}$. Find the eigenvalues and eigenvectors of A.


Solution. First we find the eigenvalues of A. We will do so using Definition 7.2.
In order to find the eigenvalues of A, we solve the following equation.

$$\det(xI - A) = \det \begin{bmatrix} x - 2 & -2 & 2 \\ -1 & x - 3 & 1 \\ 1 & -1 & x - 1 \end{bmatrix} = 0$$

This reduces to $x^3 - 6x^2 + 8x = 0$. You can verify that the solutions are $\lambda_1 = 0$, $\lambda_2 = 2$,
$\lambda_3 = 4$. Notice that while eigenvectors can never equal 0, it is possible to have an eigenvalue
equal to 0.

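Now we will find the basic eigenvectors. For $\lambda_1 = 0$, we need to solve the equation
$(0I - A)X = 0$. This equation becomes $-AX = 0$, and so the augmented matrix for finding
the solutions is given by

$$\left[\begin{array}{rrr|r} -2 & -2 & 2 & 0 \\ -1 & -3 & 1 & 0 \\ 1 & -1 & -1 & 0 \end{array}\right]$$

The reduced row-echelon form is

$$\left[\begin{array}{rrr|r} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

Therefore, the eigenvectors are of the form $t \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ where $t \neq 0$ and the basic eigenvector is
given by

$$X_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$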


We can verify that this eigenvector is correct by checking that the equation $AX_1 = 0X_1$
holds. The product $AX_1$ is given by

$$AX_1 = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

This clearly equals $0X_1$, so the equation holds. Hence, $AX_1 = 0X_1$ and so 0 is an
eigenvalue of A.

Computing the other basic eigenvectors is left as an exercise. □ 

In the following sections, we examine ways to simplify this process of finding eigenvalues 
and eigenvectors by using properties of special types of matrices. 


7.1.3. Eigenvalues and Eigenvectors for Special Types of Matrices 


There are three special kinds of matrices which we can use to simplify the process of finding
eigenvalues and eigenvectors. Throughout this section, we will discuss similar matrices,
elementary matrices, as well as triangular matrices.

We begin with a definition.


Definition 7.9: Similar Matrices

Let A and B be n × n matrices. Suppose there exists an invertible matrix P such that

$$A = P^{-1}BP$$

Then A and B are called similar matrices.



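It turns out that we can use the concept of similar matrices to help us find the eigenvalues
of matrices. Consider the following lemma.


Lemma 7.10: Similar Matrices and Eigenvalues

Let A and B be similar matrices, so that $A = P^{-1}BP$ where A, B are n × n matrices and P
is invertible. Then A, B have the same eigenvalues.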



Proof. We need to show two things. First, we need to show that if $A = P^{-1}BP$, then every
eigenvalue of A is also an eigenvalue of B. Secondly, we show that every eigenvalue of B is
also an eigenvalue of A.

Here is the proof of the first statement. Suppose $A = P^{-1}BP$ and $\lambda$ is an eigenvalue of
A, that is $AX = \lambda X$ for some $X \neq 0$. Then

$$P^{-1}BPX = \lambda X$$

and so

$$BPX = \lambda PX$$

Since P is one to one and $X \neq 0$, it follows that $PX \neq 0$. Here, PX plays the role of
the eigenvector in this equation. Thus $\lambda$ is also an eigenvalue of B.

Proving the second statement is similar, starting from $B = PAP^{-1}$, and is left as an exercise.
Thus both matrices have the same eigenvalues as desired. □

Note that this proof also demonstrates that the eigenvectors of A and B will (generally)
be different. We see in the proof that $AX = \lambda X$, while $B(PX) = \lambda(PX)$. Therefore, for
an eigenvalue $\lambda$, A will have the eigenvector X while B will have the eigenvector PX.

The second special type of matrices we discuss in this section is elementary matrices. 
Recall from Definition 2.43 that an elementary matrix E is obtained by applying one row 
operation to the identity matrix. 

It is possible to use elementary matrices to simplify a matrix before searching for its
eigenvalues and eigenvectors. This is illustrated in the following example.

Example 7.11: Simplify Using Elementary Matrices


Find the eigenvalues for the matrix

$$A = \begin{bmatrix} 33 & 105 & 105 \\ 10 & 28 & 30 \\ -20 & -60 & -62 \end{bmatrix}$$

Solution. This matrix has big numbers and therefore we would like to simplify as much as
possible before computing the eigenvalues.

We will do so using row operations. First, add 2 times the second row to the third row.
To do so, left multiply A by E(2,2). Then right multiply A by the inverse of E(2,2) as
illustrated.

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 33 & 105 & 105 \\ 10 & 28 & 30 \\ -20 & -60 & -62 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix}$$

By Lemma 7.10, the resulting matrix has the same eigenvalues as A where here, the matrix
E(2,2) plays the role of P.

We do this step again, as follows. In this step, we use the elementary matrix obtained
by adding −3 times the second row to the first row.

$$\begin{bmatrix} 1 & -3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix} \tag{7.4}$$


Again by Lemma 7.10, this resulting matrix has the same eigenvalues as A. At this point,
we can easily find the eigenvalues. Let

$$B = \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix}$$

Then, we find the eigenvalues of B (and therefore of A) by solving the equation
$\det(xI - B) = 0$. You should verify that this equation becomes

$$(x + 2)(x + 2)(x - 3) = 0$$

Solving this equation results in eigenvalues of $\lambda_1 = -2$, $\lambda_2 = -2$, and $\lambda_3 = 3$. Therefore,
these are also the eigenvalues of A.

□
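
The two similarity steps of this example can be checked with a short Python sketch. It is an illustration only: the helper names `mat_mul` and `char_poly_3x3` are assumptions, and the check uses the fact that for a 3 × 3 matrix M the polynomial det(xI − M) equals x³ − (trace)x² + (sum of the principal 2 × 2 minors)x − det M.

```python
def mat_mul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def char_poly_3x3(M):
    """Return (c2, c1, c0) with det(xI - M) = x^3 + c2 x^2 + c1 x + c0."""
    (a, b, c), (d, e, f), (g, h, i) = M
    trace = a + e + i
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return (-trace, minors, -det)


A = [[33, 105, 105], [10, 28, 30], [-20, -60, -62]]
# First step: add 2 times row 2 to row 3, then undo it on the right.
E1 = [[1, 0, 0], [0, 1, 0], [0, 2, 1]]
E1_inv = [[1, 0, 0], [0, 1, 0], [0, -2, 1]]
# Second step: add -3 times row 2 to row 1, then undo it on the right.
E2 = [[1, -3, 0], [0, 1, 0], [0, 0, 1]]
E2_inv = [[1, 3, 0], [0, 1, 0], [0, 0, 1]]

step1 = mat_mul(mat_mul(E1, A), E1_inv)
B = mat_mul(mat_mul(E2, step1), E2_inv)
print(B)
print(char_poly_3x3(A), char_poly_3x3(B))
```

Equal characteristic polynomials mean equal eigenvalues with equal multiplicities, which is what Lemma 7.10 leads us to expect here.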


Through using elementary matrices, we were able to create a matrix for which finding
the eigenvalues was easier than for A. At this point, you could go back to the original matrix
A and solve $(\lambda I - A)X = 0$ to obtain the eigenvectors of A.

Notice that when you multiply on the right by an elementary matrix, you are doing the
column operation defined by the elementary matrix. In 7.4 multiplication by the elementary
matrix on the right merely involves taking three times the first column and adding to the
second. Thus, without referring to the elementary matrices, the transition to the new matrix
in 7.4 can be illustrated by

$$\begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \to \begin{bmatrix} 3 & -9 & 15 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \to \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix}$$


The third special type of matrix we will consider in this section is the triangular matrix.
Recall Definition 3.12 which states that an upper (lower) triangular matrix contains all zeros
below (above) the main diagonal. Remember that finding the determinant of a triangular
matrix is a simple procedure of taking the product of the entries on the main diagonal. It
turns out that there is also a simple way to find the eigenvalues of a triangular matrix.

In the next example we will demonstrate that the eigenvalues of a triangular matrix are 
the entries on the main diagonal. 


Example 7.12: Eigenvalues for a Triangular Matrix

Let $A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 4 & 7 \\ 0 & 0 & 6 \end{bmatrix}$. Find the eigenvalues of A.


Solution. We need to solve the equation $\det(xI - A) = 0$ as follows

$$\det(xI - A) = \det \begin{bmatrix} x - 1 & -2 & -4 \\ 0 & x - 4 & -7 \\ 0 & 0 & x - 6 \end{bmatrix} = (x - 1)(x - 4)(x - 6) = 0$$

Solving the equation $(x - 1)(x - 4)(x - 6) = 0$ for x results in the eigenvalues $\lambda_1 = 1$,
$\lambda_2 = 4$ and $\lambda_3 = 6$. Thus the eigenvalues are the entries on the main diagonal of the
original matrix. □

The same result is true for lower triangular matrices. For any triangular matrix, the 
eigenvalues are equal to the entries on the main diagonal. To find the eigenvectors of a 
triangular matrix, we use the usual procedure. 
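
Reading the eigenvalues off a triangular matrix takes only a few lines of Python. The sketch below is an illustration; the function name `triangular_eigenvalues` is an assumption, and the matrix is given as a list of rows.

```python
def triangular_eigenvalues(A):
    """Return the diagonal entries of a square upper or lower triangular matrix A."""
    n = len(A)
    # Upper triangular: every entry below the main diagonal is zero.
    upper = all(A[i][j] == 0 for i in range(n) for j in range(i))
    # Lower triangular: every entry above the main diagonal is zero.
    lower = all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))
    if not (upper or lower):
        raise ValueError("matrix is not triangular")
    # For a triangular matrix, det(xI - A) is the product of the (x - A[i][i]),
    # so the eigenvalues are exactly the diagonal entries.
    return [A[i][i] for i in range(n)]


print(triangular_eigenvalues([[1, 2, 4], [0, 4, 7], [0, 0, 6]]))
```

Repeated diagonal entries are returned as many times as they occur, which matches their multiplicity as roots of the characteristic polynomial.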

In the next section, we explore an important process involving the eigenvalues and
eigenvectors of a matrix.


7.1.4. Exercises 

1. If A is an invertible n × n matrix, compare the eigenvalues of A and $A^{-1}$. More
generally, for m an arbitrary integer, compare the eigenvalues of A and $A^m$.

2. If A is an n × n matrix and c is a nonzero constant, compare the eigenvalues of A and
cA.

3. Let A, B be invertible n × n matrices which commute. That is, AB = BA. Suppose
X is an eigenvector of B. Show that then AX must also be an eigenvector for B.

4. Suppose A is an n × n matrix and it satisfies $A^m = A$ for some m a positive integer
larger than 1. Show that if $\lambda$ is an eigenvalue of A then $|\lambda|$ equals either 0 or 1.

5. Show that if $AX = \lambda X$ and $AY = \lambda Y$, then whenever k, p are scalars,

$$A(kX + pY) = \lambda(kX + pY)$$

Does this imply that kX + pY is an eigenvector? Explain.

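6. Suppose A is a 3 × 3 matrix and the following information is available.

$$A \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix} = 0 \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix}$$

$$A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = -2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

$$A \begin{bmatrix} -2 \\ -3 \\ -2 \end{bmatrix} = -2 \begin{bmatrix} -2 \\ -3 \\ -2 \end{bmatrix}$$

Find $A \begin{bmatrix} 1 \\ -4 \\ 3 \end{bmatrix}$.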

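7. Suppose A is a 3 × 3 matrix and the following information is available.

$$A \begin{bmatrix} -1 \\ -2 \\ -2 \end{bmatrix} = 1 \begin{bmatrix} -1 \\ -2 \\ -2 \end{bmatrix}$$

$$A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

$$A \begin{bmatrix} -1 \\ -4 \\ -3 \end{bmatrix} = 2 \begin{bmatrix} -1 \\ -4 \\ -3 \end{bmatrix}$$

Find $A \begin{bmatrix} 3 \\ -4 \\ 3 \end{bmatrix}$.

8. Suppose A is a 3 × 3 matrix and the following information is available.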



9. Find the eigenvalues and eigenvectors of the matrix

$$\begin{bmatrix} -6 & -92 & 12 \\ 0 & 0 & 0 \\ -2 & -31 & 4 \end{bmatrix}$$

One eigenvalue is −2.

10. Find the eigenvalues and eigenvectors of the matrix

$$\begin{bmatrix} -2 & -17 & -6 \\ 0 & 0 & 0 \\ 1 & 9 & 3 \end{bmatrix}$$

One eigenvalue is 1.

11. Find the eigenvalues and eigenvectors of the matrix

$$\begin{bmatrix} 9 & 2 & 8 \\ 2 & -6 & -2 \\ -8 & 2 & -5 \end{bmatrix}$$

One eigenvalue is −3.

12. Find the eigenvalues and eigenvectors of the matrix

$$\begin{bmatrix} 6 & 76 & 16 \\ -2 & -21 & -4 \\ 2 & 64 & 17 \end{bmatrix}$$

One eigenvalue is −2.

13. Find the eigenvalues and eigenvectors of the matrix

$$\begin{bmatrix} 3 & 5 & 2 \\ -8 & -11 & -4 \\ 10 & 11 & 3 \end{bmatrix}$$

One eigenvalue is −3.

14. If A is the matrix of a linear transformation which rotates all vectors in $\mathbb{R}^2$ through
60°, explain why A cannot have any real eigenvalues. Is there an angle such that
rotation through this angle would have a real eigenvalue? What eigenvalues would be
obtainable in this way?

15. Let A be the 2 × 2 matrix of the linear transformation which rotates all vectors in $\mathbb{R}^2$
through an angle of $\theta$. For which values of $\theta$ does A have a real eigenvalue?

16. Is it possible for a nonzero matrix to have only 0 as an eigenvalue?

17. Let A be the 2 × 2 matrix of the linear transformation which rotates all vectors in $\mathbb{R}^2$
through an angle of $\theta$. For which values of $\theta$ does A have a real eigenvalue?

18. Let T be the linear transformation which reflects vectors about the x axis. Find a
matrix for T and then find its eigenvalues and eigenvectors.

19. Let T be the linear transformation which rotates all vectors in $\mathbb{R}^2$ counterclockwise
through an angle of $\pi/2$. Find a matrix of T and then find eigenvalues and eigenvectors.

20. Let T be the linear transformation which reflects all vectors in $\mathbb{R}^3$ through the xy
plane. Find a matrix for T and then obtain its eigenvalues and eigenvectors.

7.2 Diagonalization 


Outcomes 


A. Determine when it is possible to diagonalize a matrix. 

B. When possible, diagonalize a matrix. 


We begin this section by recalling Definition 7.9 of similar matrices. Recall that if A, B are two n x n matrices, then they are similar if and only if there exists an invertible matrix P such that

\[
A = P^{-1} B P
\]

The following are important properties of similar matrices.

Proposition 7.13: Properties of Similarity


Define $\sim$ for n x n matrices A, B and C by $A \sim B$ if A is similar to B. Then

• $A \sim A$

• If $A \sim B$ then $B \sim A$

• If $A \sim B$ and $B \sim C$ then $A \sim C$


Proof. It is clear that $A \sim A$, taking $P = I$.

Now, if $A \sim B$, then for some invertible P,

\[
A = P^{-1} B P
\]

and so

\[
P A P^{-1} = B
\]

But then

\[
\left(P^{-1}\right)^{-1} A P^{-1} = B
\]

which shows that $B \sim A$ by Definition 7.9.

Now suppose $A \sim B$ and $B \sim C$. Then there exist invertible matrices P, Q such that

\[
A = P^{-1} B P, \quad B = Q^{-1} C Q
\]

Then,

\[
A = P^{-1} \left(Q^{-1} C Q\right) P = (QP)^{-1} C (QP)
\]

showing that A is similar to C by Definition 7.9. □


When a matrix is similar to a diagonal matrix, the matrix is said to be diagonalizable. We define a diagonal matrix D as a matrix containing a zero in every entry except those on the main diagonal. More precisely, if $d_{ij}$ is the $ij$-th entry of a diagonal matrix D, then $d_{ij} = 0$ unless $i = j$. Such matrices look like the following.

\[
D = \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix}
\]

where * is a number which might not be zero.

Formally, an n x n matrix A is diagonalizable if there exists an invertible matrix P such that $P^{-1}AP = D$, where D is a diagonal matrix.

Notice that this equation can be rearranged as $A = PDP^{-1}$. Suppose we wanted to compute $A^{100}$. By diagonalizing A first it suffices to then compute $(PDP^{-1})^{100}$, which reduces to $PD^{100}P^{-1}$. This last computation is much simpler than computing $A^{100}$ directly. While this process is described in detail later, it provides motivation for diagonalization.

7.2.1. Diagonalizing a Matrix 


The most important theorem about diagonalizability is the following major result. 


Theorem 7.15: Eigenvectors and Diagonalizable Matrices 


An n x n matrix A is diagonalizable if and only if there is an invertible matrix P given by

\[
P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}
\]

where the $X_k$ are eigenvectors of A.

Moreover if A is diagonalizable, the corresponding eigenvalues of A are the diagonal entries of the diagonal matrix D.


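Proof. Suppose P is given as above as an invertible matrix whose columns are eigenvectors of A. Then $P^{-1}$ is of the form

\[
P^{-1} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\]

where $W_k^T X_j = \delta_{kj}$, which is the Kronecker symbol discussed earlier. Then

\[
P^{-1}AP
= \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix}
= \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
= \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

Conversely, suppose A is diagonalizable so that $P^{-1}AP = D$. Let

\[
P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}
\]

where the columns are the $X_k$ and

\[
D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

Then

\[
AP = PD = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

and so

\[
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix}
= \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
\]

showing the $X_k$ are eigenvectors of A and the $\lambda_k$ are the corresponding eigenvalues.

□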


We demonstrate this concept in the next example. Note that not only are the columns of the matrix P formed by eigenvectors, but P must be invertible, so its columns must be linearly independent eigenvectors. We achieve this by using basic eigenvectors for the columns of P.



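Solution. By Theorem 7.15 we use the eigenvectors of A as the columns of P, and the corresponding eigenvalues of A as the diagonal entries of D.

First, we will find the eigenvalues of A. To do so, we solve $\det(xI - A) = 0$ as follows.

\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \right) = 0
\]

This computation is left as an exercise, and you should verify that the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 2$, and $\lambda_3 = 6$.

Next, we need to find the eigenvectors. We first find the eigenvectors for $\lambda_1, \lambda_2 = 2$. Solving $(2I - A)X = 0$ to find the eigenvectors, we find that the eigenvectors are

\[
t \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]

where t, s are scalars. Hence there are two basic eigenvectors which are given by

\[
X_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]

You can verify that the basic eigenvector for $\lambda_3 = 6$ is

\[
X_3 = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}
\]

Then, we construct the matrix P as follows.

\[
P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
= \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}
\]

That is, the columns of P are the basic eigenvectors of A. Then, you can verify that

\[
P^{-1} = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}
\]

Thus,

\[
P^{-1}AP
= \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}
\begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}
\]

You can see that the result here is a diagonal matrix where the entries on the main diagonal are the eigenvalues of A. We expected this based on Theorem 7.15. Notice that eigenvalues on the main diagonal must be in the same order as the corresponding eigenvectors in P. □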
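The verification suggested above can also be done by machine. The following is a minimal sketch in plain Python using exact fractions; the helper name `matmul` is our own and not part of the text, and the matrices are those of the example just completed.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Matrices taken from the example above.
A = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]
P = [[-2, 1, 0], [1, 0, 1], [0, 1, -2]]
P_inv = [[F(-1, 4), F(1, 2), F(1, 4)],
         [F(1, 2), F(1), F(1, 2)],
         [F(1, 4), F(1, 2), F(-1, 4)]]

# P_inv really is the inverse of P ...
identity = matmul(P_inv, P)
# ... and conjugating A by P gives the diagonal matrix of eigenvalues.
D = matmul(matmul(P_inv, A), P)

for row in D:
    print([int(entry) for entry in row])
```

Reordering the columns of P would reorder the diagonal entries of D in the same way, which is the point made at the end of the example.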


It is possible that a matrix A cannot be diagonalized. In other words, we cannot find an invertible matrix P so that $P^{-1}AP = D$.

Consider the following example.


Example 7.17: A Matrix which cannot be Diagonalized

Let

\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\]

If possible, find an invertible matrix P and diagonal matrix D so that $P^{-1}AP = D$.

Solution. Through the usual procedure, we find that the eigenvalues of A are $\lambda_1 = 1$, $\lambda_2 = 1$. To find the eigenvectors, we solve the equation $(\lambda I - A)X = 0$. The matrix $(\lambda I - A)$ is given by

\[
\begin{bmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 1 \end{bmatrix}
\]

Substituting in $\lambda = 1$, we have the matrix

\[
\begin{bmatrix} 1 - 1 & -1 \\ 0 & 1 - 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}
\]

Then, solving the equation $(\lambda I - A)X = 0$ involves carrying the following augmented matrix to its reduced row-echelon form.

\[
\left[\begin{array}{cc|c} 0 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{cc|c} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]
\]

Then the eigenvectors are of the form

\[
t \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

and the basic eigenvector is

\[
X_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

In this case, the matrix A has one eigenvalue of multiplicity two, but only one basic eigenvector. In order to diagonalize A, we need to construct an invertible $2 \times 2$ matrix P. However, because A only has one basic eigenvector, we cannot construct this P. Notice that if we were to use $X_1$ as both columns of P, P would not be invertible. For this reason, we cannot repeat eigenvectors in P.

Hence this matrix cannot be diagonalized. □


Recall Definition 7.4 of the multiplicity of an eigenvalue. It turns out that we can determine when a matrix is diagonalizable based on the multiplicity of its eigenvalues. In order for A to be diagonalizable, the number of basic eigenvectors associated with each eigenvalue must be the same as the multiplicity of that eigenvalue. In Example 7.17, A had one eigenvalue $\lambda = 1$ of multiplicity 2. However, there was only one basic eigenvector associated with this eigenvalue. Therefore, we can see that A is not diagonalizable.

We summarize this in the following theorem. 


Theorem 7.18: Diagonalizability Condition 


An n x n matrix A is diagonalizable exactly when, for each eigenvalue of A, the number of basic eigenvectors associated with that eigenvalue is the same as its multiplicity.
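As a rough illustration of this condition, the sketch below counts the basic eigenvectors of an eigenvalue $\lambda$ as $n - \operatorname{rank}(\lambda I - A)$, using exact rational row reduction. The function names `rank` and `basic_eigenvector_count` are our own, and the two test matrices are the diagonalizable $3 \times 3$ matrix from the worked example earlier in this section and the matrix of Example 7.17.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def basic_eigenvector_count(A, lam):
    """Number of basic eigenvectors for eigenvalue lam: n - rank(lam*I - A)."""
    n = len(A)
    M = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return n - rank(M)

A_good = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]  # eigenvalue 2 has multiplicity 2
A_bad = [[1, 1], [0, 1]]                       # eigenvalue 1 has multiplicity 2

print(basic_eigenvector_count(A_good, 2))  # matches the multiplicity
print(basic_eigenvector_count(A_bad, 1))   # falls short of the multiplicity
```

The first matrix has as many basic eigenvectors as the multiplicity requires, while the second falls one short, in line with the two examples above.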

You may wonder if there is a need to find $P^{-1}$, since we can use Theorem 7.15 to construct P and D. We will see this is needed to compute high powers of matrices, which is one of the major applications of diagonalizability.

Before we do so, we first discuss complex eigenvalues.

7.2.2. Complex Eigenvalues


In some applications, a matrix may have eigenvalues which are complex numbers. For 
example, this often occurs in differential equations. These questions are approached in the 
same way as above. 

Consider the following example. 


Example 7.19: A Real Matrix with Complex Eigenvalues 

Let

\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\]

Find the eigenvalues and eigenvectors of A.



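Solution. We will first find the eigenvalues as usual by solving the following equation.

\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) = 0
\]

This reduces to $(x - 1)(x^2 - 4x + 5) = 0$. The solutions are $\lambda_1 = 1$, $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$.

There is nothing new about finding the eigenvectors for $\lambda_1 = 1$ so this is left as an exercise.

Consider now the eigenvalue $\lambda_2 = 2 + i$. As usual, we solve the equation $(\lambda I - A)X = 0$ as given by

\[
\left( (2 + i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) X
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

In other words, we need to solve the system represented by the augmented matrix

\[
\left[\begin{array}{ccc|c} 1 + i & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & -1 & i & 0 \end{array}\right]
\]

We now use our row operations to solve the system. Divide the first row by $(1 + i)$ and then take $-i$ times the second row and add to the third row. This yields

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Now multiply the second row by $-i$ to obtain the reduced row-echelon form, given by

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Therefore, the eigenvectors are of the form

\[
t \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix}
\]

and the basic eigenvector is given by

\[
X_2 = \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix}
\]

As an exercise, verify that the eigenvectors for $\lambda_3 = 2 - i$ are of the form

\[
t \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
\]

Hence, the basic eigenvector is given by

\[
X_3 = \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
\]

As usual, be sure to check your answers! To verify, we check that $AX_3 = (2 - i)X_3$ as follows.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ -1 - 2i \\ 2 - i \end{bmatrix}
= (2 - i) \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
\]

Therefore, we know that this eigenvector and eigenvalue are correct.


□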
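The same check can be carried out for both complex eigenvalues at once. Here is a small sketch using Python's built-in complex numbers (written `1j` for $i$); the helper `matvec` is our own name, and the eigenpairs are the ones found in Example 7.19.

```python
# Matrix of Example 7.19.
A = [[1, 0, 0], [0, 2, -1], [0, 1, 2]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# (eigenvalue, basic eigenvector) pairs for the two complex eigenvalues.
pairs = [(2 + 1j, [0, 1j, 1]),
         (2 - 1j, [0, -1j, 1])]

# For each pair, compare A X with lambda X entry by entry.
checks = [matvec(A, v) == [lam * x for x in v] for lam, v in pairs]
print(checks)
```

The entries here are small integers, so the floating-point complex arithmetic is exact and a direct equality comparison is safe; for general matrices a tolerance would be needed.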


Notice that in Example 7.19, two of the eigenvalues were given by $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$. You may recall that these two complex numbers are conjugates. It turns out that whenever a matrix containing real entries has a complex eigenvalue $\lambda$, it also has an eigenvalue equal to $\overline{\lambda}$, the conjugate of $\lambda$.


7.2.3. Exercises 


1. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 5 & -18 & -32 \\ 0 & 5 & 4 \\ 2 & -5 & -11 \end{bmatrix}
\]

One eigenvalue is 1. Diagonalize if possible.

2. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} -13 & -28 & 28 \\ 4 & 9 & -8 \\ -4 & -8 & 9 \end{bmatrix}
\]

One eigenvalue is 3. Diagonalize if possible.

3. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 89 & 38 & 268 \\ 14 & 2 & 40 \\ -30 & -12 & -90 \end{bmatrix}
\]

One eigenvalue is -3. Diagonalize if possible.

4. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 1 & 90 & 0 \\ 0 & -2 & 0 \\ 3 & 89 & -2 \end{bmatrix}
\]

One eigenvalue is 1. Diagonalize if possible.

5. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 11 & 45 & 30 \\ 10 & 26 & 20 \\ -20 & -60 & -44 \end{bmatrix}
\]

One eigenvalue is 1. Diagonalize if possible.

6. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 95 & 25 & 24 \\ -196 & -53 & -48 \\ -164 & -42 & -43 \end{bmatrix}
\]

One eigenvalue is 5. Diagonalize if possible.

7. Suppose A is an n x n matrix and let V be an eigenvector such that $AV = \lambda V$. Also suppose the characteristic polynomial of A is

\[
\det(xI - A) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0
\]

Explain why

\[
\left(A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I\right) V = 0
\]

If A is diagonalizable, give a proof of the Cayley Hamilton theorem based on this. This theorem says A satisfies its characteristic equation,

\[
A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0
\]

8. Suppose the characteristic polynomial of an n x n matrix A is $1 - x^n$. Find $A^{mn}$ where m is an integer.

9. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 15 & -24 & 7 \\ -6 & 5 & -1 \\ -58 & 76 & -20 \end{bmatrix}
\]

One eigenvalue is -2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

10. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} 15 & -25 & 6 \\ -13 & 23 & -4 \\ -91 & 155 & -30 \end{bmatrix}
\]

One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

11. Find the eigenvalues and eigenvectors of the matrix

\[
\begin{bmatrix} -11 & -12 & 4 \\ 8 & 17 & -4 \\ -4 & 28 & -3 \end{bmatrix}
\]

One eigenvalue is 1. Diagonalize if possible. Hint: This one has some complex eigenvalues.

12. Find the eigenvalues and eigenvectors of the matrix



One eigenvalue is -3. Diagonalize if possible. Hint: This one has some complex eigenvalues.

13. Suppose A is an n x n matrix consisting entirely of real entries but $a + ib$ is a complex eigenvalue having the eigenvector $X + iY$. Here X and Y are real vectors. Show that then $a - ib$ is also an eigenvalue with the eigenvector $X - iY$. Hint: You should remember that the conjugate of a product of complex numbers equals the product of the conjugates. Here $a + ib$ is a complex number whose conjugate equals $a - ib$.

7.3 Applications of Spectral Theory


Outcomes 


A. Use diagonalization to find a high power of a matrix. 

B. Use diagonalization to solve dynamical systems. 


7.3.1. Raising a Matrix to a High Power 


Suppose we have a matrix A and we want to find $A^{50}$. One could try to multiply A with itself 50 times, but this is computationally extremely intensive (try it!). However, diagonalization allows us to compute high powers of a matrix relatively easily. Suppose A is diagonalizable, so that $P^{-1}AP = D$. We can rearrange this equation to write $A = PDP^{-1}$.

Now, consider $A^2$. Since $A = PDP^{-1}$, it follows that

\[
A^2 = \left(PDP^{-1}\right)^2 = PDP^{-1}PDP^{-1} = PD^2P^{-1}
\]

Similarly,

\[
A^3 = \left(PDP^{-1}\right)^3 = PDP^{-1}PDP^{-1}PDP^{-1} = PD^3P^{-1}
\]

In general,

\[
A^n = \left(PDP^{-1}\right)^n = PD^nP^{-1}
\]

Therefore, we have reduced the problem to finding $D^n$. Because D is diagonal, computing $D^n$ only requires raising every entry on the main diagonal of D to the power of n.

Through this method, we can compute large powers of matrices. Consider the following 
example. 


Example 7.20: Raising a Matrix to a High Power

Let $A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}$. Find $A^{50}$.


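Solution. We will first diagonalize A. The steps are left as an exercise and you may wish to verify that the eigenvalues of A are $\lambda_1 = 1$, $\lambda_2 = 1$, and $\lambda_3 = 2$.

The basic eigenvectors corresponding to $\lambda_1, \lambda_2 = 1$ are

\[
X_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}
\]

The basic eigenvector corresponding to $\lambda_3 = 2$ is

\[
X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
\]

Now we construct P by using the basic eigenvectors of A as the columns of P. Thus

\[
P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\]

Then also

\[
P^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\]

which you may wish to verify.
Then,

\[
P^{-1}AP
= \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
= D
\]

Now it follows by rearranging the equation that

\[
A = PDP^{-1}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\]

Therefore,

\[
A^{50} = PD^{50}P^{-1}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\]

By our discussion above, $D^{50}$ is found as follows.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
= \begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\]

It follows that

\[
A^{50}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
= \begin{bmatrix} 2^{50} & -1 + 2^{50} & 0 \\ 0 & 1 & 0 \\ 1 - 2^{50} & 1 - 2^{50} & 1 \end{bmatrix}
\]


□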
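To see that the shortcut agrees with brute force, the following sketch computes $A^{50}$ both ways with exact Python integers. The helper `matmul` is our own name; A, P and $P^{-1}$ are the matrices of Example 7.20.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[2, 1, 0], [0, 1, 0], [-1, -1, 1]]
P = [[0, -1, -1], [0, 1, 0], [1, 0, 1]]
P_inv = [[1, 1, 1], [0, 1, 0], [-1, -1, 0]]

# D^50: raise each diagonal entry of D = diag(1, 1, 2) to the 50th power.
D50 = [[1, 0, 0], [0, 1, 0], [0, 0, 2 ** 50]]
via_diag = matmul(matmul(P, D50), P_inv)

# Brute force: multiply by A fifty times, starting from the identity.
direct = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
for _ in range(50):
    direct = matmul(direct, A)

print(via_diag == direct)
```

The diagonalized route needs two matrix products regardless of the exponent, whereas the loop needs as many products as the exponent.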


Through diagonalization, we can efficiently compute a high power of A. Without this, 
we would be forced to multiply this by hand! 

The next section explores another interesting application of diagonalization. 


7.3.2. Raising a Symmetric Matrix to a High Power 


We have already seen how to use matrix diagonalization to compute powers of matrices. This requires computing eigenvalues of the matrix A, and finding an invertible matrix of eigenvectors P such that $P^{-1}AP$ is diagonal. In this section we will see that if the matrix A is symmetric (see Definition 2.29), then we can actually find such a matrix P that is an orthogonal matrix of eigenvectors. Thus $P^{-1}$ is simply its transpose $P^T$, and $P^TAP$ is diagonal. When this happens we say that A is orthogonally diagonalizable.

In fact this happens if and only if A is a symmetric matrix as shown in the following important theorem.



Proof. The complete proof is beyond this course, but to give an idea assume that A has an orthonormal set of eigenvectors, and let P consist of these eigenvectors as columns. Then $P^{-1} = P^T$, and $P^TAP = D$, a diagonal matrix. But then $A = PDP^T$, and

\[
A^T = \left(PDP^T\right)^T = \left(P^T\right)^T D^T P^T = PDP^T = A
\]

so A is symmetric.

Now given a symmetric matrix A, one shows that eigenvectors corresponding to different eigenvalues are always orthogonal. So it suffices to apply the Gram-Schmidt process on the set of basic eigenvectors of each eigenvalue to obtain an orthonormal set of eigenvectors. □
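The orthogonality claim used in the proof can be justified in one line. The following is a sketch of the standard argument, with our own notation $X_1, X_2$ for eigenvectors of a symmetric matrix A belonging to distinct eigenvalues $\lambda_1 \neq \lambda_2$:

```latex
\[
\lambda_1 X_1^T X_2 = (A X_1)^T X_2 = X_1^T A^T X_2 = X_1^T A X_2 = \lambda_2 X_1^T X_2
\]
\[
\text{so } (\lambda_1 - \lambda_2)\, X_1^T X_2 = 0,
\quad \text{and since } \lambda_1 \neq \lambda_2, \quad X_1^T X_2 = 0 .
\]
```

The symmetry of A is used exactly once, in replacing $A^T$ by A.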
We demonstrate this in the following example.


Example 7.22: Orthogonal Diagonalization of a Symmetric Matrix

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Find an orthogonal matrix P such that $P^TAP$ is a diagonal matrix.


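Solution.

In this case, verify that the eigenvalues are 2 and 1. First we will find an eigenvector for the eigenvalue 2. This involves row reducing the following augmented matrix.

\[
\left[\begin{array}{ccc|c}
2 - 1 & 0 & 0 & 0 \\
0 & 2 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 2 - \frac{3}{2} & 0
\end{array}\right]
\]

The reduced row-echelon form is

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

and so an eigenvector is

\[
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]

Finally to obtain an eigenvector of length one (unit eigenvector) we simply divide this vector by its length to yield:

\[
\begin{bmatrix} 0 \\ 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}
\]

Next consider the case of the eigenvalue 1. To obtain basic eigenvectors, the matrix which needs to be row reduced in this case is

\[
\left[\begin{array}{ccc|c}
1 - 1 & 0 & 0 & 0 \\
0 & 1 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 1 - \frac{3}{2} & 0
\end{array}\right]
\]

The reduced row-echelon form is

\[
\left[\begin{array}{ccc|c} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Therefore, the eigenvectors are of the form

\[
\begin{bmatrix} s \\ -t \\ t \end{bmatrix}
\]

Note that all these vectors are automatically orthogonal to eigenvectors corresponding to the first eigenvalue. This follows from the fact that A is symmetric, as mentioned earlier. We obtain basic eigenvectors

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}
\]

Since they are themselves orthogonal (by luck here) we do not need to use the Gram-Schmidt process and instead simply normalize these vectors to obtain

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}
\]

An orthogonal matrix P to orthogonally diagonalize A is then obtained by letting these basic vectors be the columns.

\[
P = \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
\]

We verify this works. $P^TAP$ is of the form

\[
\begin{bmatrix} 0 & -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ 1 & 0 & 0 \\ 0 & \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]

which is the desired diagonal matrix. □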
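A numerical check of this example is sketched below. Because $1/\sqrt{2}$ is irrational, the sketch uses floating point and compares entries up to a small tolerance; the helper names `matmul`, `transpose` and `close`, and the tolerance value, are our own choices.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) <= tol
               for row_x, row_y in zip(X, Y)
               for a, b in zip(row_x, row_y))

r = 1 / math.sqrt(2)
A = [[1, 0, 0], [0, 1.5, 0.5], [0, 0.5, 1.5]]
P = [[0, 1, 0], [-r, 0, r], [r, 0, r]]
Pt = transpose(P)

# P is orthogonal: P^T P = I.
is_orthogonal = close(matmul(Pt, P), [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# P^T A P is the diagonal matrix of eigenvalues, in the order of P's columns.
is_diagonalized = close(matmul(matmul(Pt, A), P),
                        [[1, 0, 0], [0, 1, 0], [0, 0, 2]])
print(is_orthogonal, is_diagonalized)
```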


We can now apply this technique to efficiently compute high powers of a symmetric 
matrix. 


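Example 7.23: Powers of a Symmetric Matrix

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Compute $A^7$.

Solution. We found in Example 7.22 that $P^TAP = D$ is diagonal, where

\[
P = \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
\quad \text{and} \quad
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]

Thus $A = PDP^T$ and $A^7 = PDP^T \, PDP^T \cdots PDP^T = PD^7P^T$ which gives:

\[
\begin{aligned}
A^7 &= \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^7
\begin{bmatrix} 0 & -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ 1 & 0 & 0 \\ 0 & \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{bmatrix} \\
&= \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2^7 \end{bmatrix}
\begin{bmatrix} 0 & -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ 1 & 0 & 0 \\ 0 & \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{bmatrix} \\
&= \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}
\begin{bmatrix} 0 & -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ 1 & 0 & 0 \\ 0 & \frac{2^7}{\sqrt{2}} & \frac{2^7}{\sqrt{2}} \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{2^7 + 1}{2} & \frac{2^7 - 1}{2} \\ 0 & \frac{2^7 - 1}{2} & \frac{2^7 + 1}{2} \end{bmatrix}
\end{aligned}
\]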

7.3.3. Markov Matrices 

There are applications of great importance which feature a special type of matrix. Matrices in which the entries are non-negative numbers and each column sums to one are called Markov matrices. An important application of Markov matrices is in population migration, as illustrated in the following definition.


Definition 7.24: Migration Matrices


Let n locations be denoted by the numbers $1, 2, \cdots, n$. Suppose it is the case that each year the proportion of residents in location j which move to location i is $a_{ij}$. Also suppose no one leaves these n locations and no one arrives from outside them. This last assumption requires $\sum_i a_{ij} = 1$, and means that the matrix A, such that $A = [a_{ij}]$, is a Markov matrix. In this context, A is also called a migration matrix.


Consider the following example which demonstrates this situation.

Example 7.25: Migration Matrix


Let A be a Markov matrix given by

\[
A = \begin{bmatrix} .4 & .2 \\ .6 & .8 \end{bmatrix}
\]

Verify that A is a Markov matrix and describe the entries of A in terms of population migration.


Solution. The columns of A consist of non-negative numbers which sum to 1. Hence, A is a Markov matrix.

Now, consider the entries of A in terms of population. The entry $a_{11} = .4$ is the proportion of residents in location one which stay in location one in a given time period. Entry $a_{21} = .6$ is the proportion of residents in location 1 which move to location 2 in the same time period. Entry $a_{12} = .2$ is the proportion of residents in location 2 which move to location 1. Finally, entry $a_{22} = .8$ is the proportion of residents in location 2 which stay in location 2 in this time period.

Considered as a Markov matrix, these numbers are usually identified with probabilities. Hence, we can say that the probability that a resident of location one will stay in location one in the time period is .4. □


Observe that in Example 7.25 if there were initially, say, 15 thousand people in location 1 and 10 thousand in location 2, then after one year there would be $.4 \times 15 + .2 \times 10 = 8$ thousand people in location 1, and similarly there would be $.6 \times 15 + .8 \times 10 = 17$ thousand people in location 2.

More generally let $X_n = [x_{1n} \cdots x_{mn}]^T$ where $x_{in}$ is the population of location i at time period n. We call $X_n$ the state vector at period n. In particular, we call $X_0$ the initial state vector. Letting A be the migration matrix, we compute the population in each location one time period later by $AX_n$. In order to find the population of location i after k years, we compute the $i$-th component of $A^kX_0$. This discussion is summarized in the following theorem.


Theorem 7.26: State Vector 


Let A be the migration matrix of a population and let $X_n$ be the vector whose entries give the population of each location at time period n. Then $X_n$ is the state vector at period n and it follows that

\[
X_{n+1} = AX_n
\]


The sum of the entries of $X_n$ will equal the sum of the entries of the initial vector $X_0$. Since the columns of A sum to 1, this sum is preserved for every multiplication by A as demonstrated below.

\[
\sum_i \sum_j a_{ij} x_j = \sum_j x_j \left( \sum_i a_{ij} \right) = \sum_j x_j
\]
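As a quick illustration of $X_{n+1} = AX_n$ and of the preserved total, the sketch below applies the matrix of Example 7.25 to the populations of 15 and 10 thousand mentioned above. The function name `step` is our own.

```python
from fractions import Fraction as F

# Migration matrix of Example 7.25, written with exact fractions.
A = [[F(4, 10), F(2, 10)],
     [F(6, 10), F(8, 10)]]
X0 = [15, 10]  # populations in thousands

def step(A, X):
    """One time period: return the state vector A X."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

X1 = step(A, X0)
print(X1, sum(X1) == sum(X0))
```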



Consider the following example.

Example 7.27: Using a Migration Matrix


Consider the migration matrix

\[
A = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}
\]

for locations 1, 2, and 3. Suppose initially there are 100 residents in location 1, 200 in location 2 and 400 in location 3. Find the population in the three locations after 1, 2, and 10 units of time.


Solution. Using Theorem 7.26 we can find the population in each location using the equation $X_{n+1} = AX_n$. For the population after 1 unit, we calculate $X_1 = AX_0$ as follows.

\[
X_1 = AX_0
\]
\[
\begin{bmatrix} x_{11} \\ x_{21} \\ x_{31} \end{bmatrix}
= \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
= \begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix}
\]

Therefore after one time period, location 1 has 100 residents, location 2 has 180, and location 3 has 420. Notice that the total population is unchanged; it simply migrates within the given locations. We find the populations after two time periods in the same way.

\[
X_2 = AX_1
\]
\[
\begin{bmatrix} x_{12} \\ x_{22} \\ x_{32} \end{bmatrix}
= \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}
\begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix}
= \begin{bmatrix} 102 \\ 164 \\ 434 \end{bmatrix}
\]

We could progress in this manner to find the populations after 10 time periods. However from our above discussion, we can simply calculate $A^nX_0$, where n denotes the number of time periods which have passed. Therefore, we compute the populations in each location after 10 units of time as follows.

\[
X_{10} = A^{10}X_0
\]
\[
\begin{bmatrix} x_{1\,10} \\ x_{2\,10} \\ x_{3\,10} \end{bmatrix}
= \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}^{10}
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
= \begin{bmatrix} 115.085\,829\,22 \\ 120.130\,672\,44 \\ 464.783\,498\,34 \end{bmatrix}
\]

Since we are speaking about populations, we would need to round these numbers to provide a logical answer. Therefore, we can say that after 10 units of time, there will be 115 residents in location one, 120 in location two, and 465 in location three.

□
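The ten-step computation is tedious by hand but short by machine. The sketch below simply iterates $X_{n+1} = AX_n$ with exact fractions and records every state; the list name `history` is our own.

```python
from fractions import Fraction as F

# Migration matrix and initial state of Example 7.27.
A = [[F(6, 10), F(0), F(1, 10)],
     [F(2, 10), F(8, 10), F(0)],
     [F(2, 10), F(2, 10), F(9, 10)]]
X = [F(100), F(200), F(400)]

history = [X]  # history[n] is the state vector X_n
for _ in range(10):
    X = [sum(a * x for a, x in zip(row, X)) for row in A]
    history.append(X)

for n in (1, 2, 10):
    print(n, [float(v) for v in history[n]])
```

Each state sums to 700, as the discussion following Theorem 7.26 predicts.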

Suppose we wish to know how many residents will be in a certain location after a very long time. It turns out that if some power of the migration matrix has all positive entries, then there is a vector $X_s$ such that $A^nX_0$ approaches $X_s$ as n becomes very large. Hence as more time passes and n increases, $A^nX_0$ will become closer to the vector $X_s$.

Consider Theorem 7.26. Let n increase so that $X_n$ approaches $X_s$. As $X_n$ becomes closer to $X_s$, so too does $X_{n+1}$. For sufficiently large n, the statement $X_{n+1} = AX_n$ can be written as $X_s = AX_s$.

This discussion motivates the following theorem. 



Note that the condition in Theorem 7.28 can be written as $(I - A)X_s = 0$, representing a homogeneous system of equations.

Consider the following example. Notice that it is the same example as Example 7.27 but here it will involve a longer time frame.



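Solution. By Theorem 7.28 the steady state vector $X_s$ can be found by solving the system $(I - A)X_s = 0$.

Thus we need to find a solution to

\[
\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \right)
\begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

The augmented matrix and the resulting reduced row-echelon form are given by

\[
\left[\begin{array}{ccc|c} 0.4 & 0 & -0.1 & 0 \\ -0.2 & 0.2 & 0 & 0 \\ -0.2 & -0.2 & 0.1 & 0 \end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & -0.25 & 0 \\ 0 & 1 & -0.25 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Therefore, the eigenvectors are

\[
t \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix}
\]

The initial vector $X_0$ is given by

\[
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
\]

Now all that remains is to choose the value of t such that

\[
0.25t + 0.25t + t = 100 + 200 + 400
\]

Solving this equation for t yields $t = \frac{1400}{3}$. Therefore the population in the long run is given by

\[
\frac{1400}{3} \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix}
= \begin{bmatrix} 116.666\,666\,666\,666\,7 \\ 116.666\,666\,666\,666\,7 \\ 466.666\,666\,666\,666\,7 \end{bmatrix}
\]

Again, because we are working with populations, these values need to be rounded. The steady state vector $X_s$ is given by

\[
\begin{bmatrix} 117 \\ 117 \\ 466 \end{bmatrix}
\]

□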
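The scaling step of this solution is easy to reproduce exactly. In the sketch below the direction vector is read off from the reduced row-echelon form above, and the variable names (`direction`, `total`, `Xs`) are our own.

```python
from fractions import Fraction as F

A = [[F(6, 10), F(0), F(1, 10)],
     [F(2, 10), F(8, 10), F(0)],
     [F(2, 10), F(2, 10), F(9, 10)]]

# Eigenvector direction for eigenvalue 1, from the reduced row-echelon form.
direction = [F(1, 4), F(1, 4), F(1)]

# Choose t so the entries add up to the total initial population.
total = 100 + 200 + 400
t = total / sum(direction)
Xs = [t * d for d in direction]

# A steady state must satisfy A Xs = Xs.
A_Xs = [sum(a * x for a, x in zip(row, Xs)) for row in A]
print(t, [float(x) for x in Xs], A_Xs == Xs)
```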

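We can see that the numbers we calculated in Example 7.27 for the populations after the 10th unit of time are not far from the long term values.

Consider another example.


Example 7.30: Populations After a Long Time

Suppose a migration matrix is given by

\[
A = \begin{bmatrix} \frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\ \frac{11}{20} & \frac{1}{4} & \frac{3}{10} \end{bmatrix}
\]

Find the comparison between the populations in the three locations after a long time.

Solution. In order to compare the populations in the long term, we want to find the steady state vector $X_s$. Solve

\[
\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} \frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\ \frac{11}{20} & \frac{1}{4} & \frac{3}{10} \end{bmatrix} \right)
\begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

The augmented matrix and the resulting reduced row-echelon form are given by

\[
\left[\begin{array}{ccc|c}
\frac{4}{5} & -\frac{1}{2} & -\frac{1}{5} & 0 \\
-\frac{1}{4} & \frac{3}{4} & -\frac{1}{2} & 0 \\
-\frac{11}{20} & -\frac{1}{4} & \frac{7}{10} & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{ccc|c}
1 & 0 & -\frac{16}{19} & 0 \\
0 & 1 & -\frac{18}{19} & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is

\[
\begin{bmatrix} 16 \\ 18 \\ 19 \end{bmatrix}
\]

Therefore, the proportion of population in location 2 to location 1 is given by $\frac{18}{16}$. The proportion of population in location 3 to location 2 is given by $\frac{19}{18}$.


□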
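The eigenvector found here can be confirmed without redoing the row reduction: it is enough to check that A leaves it unchanged. A minimal sketch with exact fractions follows; the variable names are our own.

```python
from fractions import Fraction as F

# Migration matrix of Example 7.30.
A = [[F(1, 5), F(1, 2), F(1, 5)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(11, 20), F(1, 4), F(3, 10)]]
v = [16, 18, 19]

# Each column sums to one, so A is a Markov matrix.
column_sums = [sum(col) for col in zip(*A)]

# v is a steady state direction: A v = v.
Av = [sum(a * x for a, x in zip(row, v)) for row in A]

# Long-run population ratios read off from v.
ratio_2_to_1 = F(v[1], v[0])
ratio_3_to_2 = F(v[2], v[1])
print(column_sums, Av == v, ratio_2_to_1, ratio_3_to_2)
```

Note that `Fraction` reduces 18/16 to 9/8 automatically; the two are the same ratio.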


Eigenvalues of Markov Matrices 

The following is an important proposition. 


Proposition 7.31: Eigenvalues of a Migration Matrix 


Let $A = [a_{ij}]$ be a migration matrix. Then 1 is always an eigenvalue for A.


Proof. Remember that the determinant of a matrix always equals that of its transpose. Therefore,

\[
\det(xI - A) = \det\left((xI - A)^T\right) = \det\left(xI - A^T\right)
\]

because $I^T = I$. Thus the characteristic equation for A is the same as the characteristic equation for $A^T$. Consequently, A and $A^T$ have the same eigenvalues. We will show that 1 is an eigenvalue for $A^T$ and then it will follow that 1 is an eigenvalue for A.

Remember that for a migration matrix, $\sum_i a_{ij} = 1$. Therefore, if $A^T = [b_{ij}]$ with $b_{ij} = a_{ji}$, it follows that

\[
\sum_j b_{ij} = \sum_j a_{ji} = 1
\]

Therefore, from matrix multiplication,

\[
A^T \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
= \begin{bmatrix} \sum_j b_{1j} \\ \vdots \\ \sum_j b_{nj} \end{bmatrix}
= \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
\]

Notice that this shows that $\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}$ is an eigenvector for $A^T$ corresponding to the eigenvalue $\lambda = 1$. As explained above, this shows that $\lambda = 1$ is an eigenvalue for A because A and $A^T$ have the same eigenvalues. □
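Both halves of this argument can be observed on a concrete case. The sketch below uses the migration matrix of Example 7.27 as an illustration: it checks that $A^T$ fixes the all-ones vector and that $\det(I - A) = 0$. The helper `det3` (a cofactor expansion for $3 \times 3$ matrices) is our own.

```python
from fractions import Fraction as F

# Migration matrix of Example 7.27, used here only as an illustration.
A = [[F(6, 10), F(0), F(1, 10)],
     [F(2, 10), F(8, 10), F(0)],
     [F(2, 10), F(2, 10), F(9, 10)]]

# A^T times the all-ones vector: each entry is a row sum of A^T,
# that is, a column sum of A.
At = [list(col) for col in zip(*A)]
At_ones = [sum(row) for row in At]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# 1 is an eigenvalue of A exactly when det(1*I - A) = 0.
I_minus_A = [[(1 if r == c else 0) - A[r][c] for c in range(3)] for r in range(3)]
det_I_minus_A = det3(I_minus_A)
print(At_ones, det_I_minus_A)
```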


7.3.4. Dynamical Systems 


The migration matrices discussed above give an example of a discrete dynamical system. We call them discrete because they involve discrete values taken at a sequence of points rather than on a continuous interval of time.

An example of a situation which can be studied in this way is a predator prey model. Consider the following model where x is the number of prey and y the number of predators in a certain area at a certain time. These are functions of $n \in \mathbb{N}$ where $n = 1, 2, \cdots$ are the ends of intervals of time which may be of interest in the problem. In other words, $x(n)$ is the number of prey at the end of the $n$-th interval of time. An example of this situation may be modeled by the following equation

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
= \begin{bmatrix} 2 & -3 \\ 1 & 4 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

This says that from time period n to n + 1, x increases if there are more x and decreases as there are more y. In the context of this example, this means that as the number of predators increases, the number of prey decreases. As for y, it increases if there are more y and also if there are more x.

This is an example of a matrix recurrence which we define now.



In this section, we will examine how to find solutions to a dynamical system given certain 
initial conditions. This process involves several concepts previously studied, including matrix 
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diagonalization and Markov matrices. The procedure is given as follows. Recall that when 
diagonalized, we can write A n = PD n P _1 . 


Procedure 7.33: Solving a Dynamical System 


Suppose a dynamical system is given by 

\[
\begin{aligned}
x_{n+1} &= a x_n + b y_n \\
y_{n+1} &= c x_n + d y_n
\end{aligned}
\]

Given initial conditions $x_0$ and $y_0$, the solutions to the system are found as follows: 

1. Express the dynamical system in the form $V_{n+1} = AV_n$. 

2. Diagonalize $A$ to be written as $A = PDP^{-1}$. 

3. Then $V_n = PD^nP^{-1}V_0$ where $V_0$ is the vector containing the initial conditions. 

4. If given specific values for $n$, substitute into this equation. Otherwise, find a 
general solution for $n$. 
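
The steps of this procedure can be sketched numerically. The following is a minimal illustration in Python, assuming NumPy is available; the function name `solve_dynamical_system` is hypothetical, and the sample matrix and initial condition are the ones used in Example 7.34.

```python
import numpy as np

def solve_dynamical_system(A, v0, n):
    """Return V_n = P D^n P^{-1} V_0 for a diagonalizable real matrix A."""
    eigenvalues, P = np.linalg.eig(A)        # columns of P are eigenvectors of A
    D_n = np.diag(eigenvalues ** n)          # D^n: each eigenvalue raised to the n-th power
    V_n = P @ D_n @ np.linalg.inv(P) @ np.asarray(v0, dtype=float)
    # A and v0 are real, so any imaginary part is rounding error.
    return np.real(V_n)

# Sample system: x_{n+1} = 1.5 x_n - 0.5 y_n, y_{n+1} = 1.0 x_n
A = np.array([[1.5, -0.5],
              [1.0,  0.0]])
v5 = solve_dynamical_system(A, [20.0, 10.0], 5)
print(v5)
```

For small $n$ the same values can be obtained by multiplying by $A$ repeatedly; the diagonalized form is useful mainly for large $n$ or for a general formula.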

We will now consider an example in detail. 


Example 7.34: Solutions of a Discrete Dynamical System 


Suppose a dynamical system is given by 

\[
\begin{aligned}
x_{n+1} &= 1.5 x_n - 0.5 y_n \\
y_{n+1} &= 1.0 x_n
\end{aligned}
\]

Express this system as a matrix recurrence and find solutions to the dynamical system 
for initial conditions $x_0 = 20$, $y_0 = 10$. 


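Solution. First, we express the system as a matrix recurrence. 

\[
V_{n+1} = AV_n
\]
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

Then 

\[
A = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix}
\]

You can verify that the eigenvalues of $A$ are 1 and 0.5. By diagonalizing, we can write $A$ in 
the form 

\[
PDP^{-1} =
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]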
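Now given an initial condition 

\[
V_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]

the solution to the dynamical system is given by 

\[
V_n = PD^nP^{-1}V_0
\]
\[
\begin{aligned}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
&=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix}^n
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & (0.5)^n \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&=
\begin{bmatrix} y_0\left((0.5)^n - 1\right) - x_0\left((0.5)^n - 2\right) \\ y_0\left(2(0.5)^n - 1\right) - x_0\left(2(0.5)^n - 2\right) \end{bmatrix}
\end{aligned}
\]

If we let $n$ become arbitrarily large, this vector approaches 

\[
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
\]

Thus for large $n$, 

\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\approx
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
\]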

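Now suppose the initial condition is given by 

\[
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 20 \\ 10 \end{bmatrix}
\]

Then, we can find solutions for various values of $n$. Here are the solutions for values of 
$n$ between 1 and 5 

\[
n = 1: \begin{bmatrix} 25.0 \\ 20.0 \end{bmatrix}, \quad
n = 2: \begin{bmatrix} 27.5 \\ 25.0 \end{bmatrix}, \quad
n = 3: \begin{bmatrix} 28.75 \\ 27.5 \end{bmatrix}
\]
\[
n = 4: \begin{bmatrix} 29.375 \\ 28.75 \end{bmatrix}, \quad
n = 5: \begin{bmatrix} 29.688 \\ 29.375 \end{bmatrix}
\]

Notice that as $n$ increases, we approach the vector given by 

\[
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
=
\begin{bmatrix} 2(20) - 10 \\ 2(20) - 10 \end{bmatrix}
=
\begin{bmatrix} 30 \\ 30 \end{bmatrix}
\]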

These solutions are graphed in the following figure. 

[Figure: graph of the solutions, with $x$ on the horizontal axis.] 

□ 

The following example demonstrates another system which exhibits some interesting 
behavior. When we graph the solutions, it is possible for the ordered pairs to spiral around 
the origin. 


Example 7.35: Finding Solutions to a Dynamical System 


Suppose a dynamical system is of the form 

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

Find solutions to the dynamical system for given initial conditions. 


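Solution. Let 

\[
A = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix}
\]

To find solutions, we must diagonalize $A$. You can verify that the eigenvalues of $A$ are 
complex and are given by $\lambda_1 = 0.7 + 0.7i$ and $\lambda_2 = 0.7 - 0.7i$. The eigenvector for $\lambda_1 = 0.7 + 0.7i$ is 

\[
\begin{bmatrix} 1 \\ i \end{bmatrix}
\]

and the eigenvector for $\lambda_2 = 0.7 - 0.7i$ is 

\[
\begin{bmatrix} 1 \\ -i \end{bmatrix}
\]

Thus the matrix $A$ can be written in the form 

\[
A =
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} 0.7 + 0.7i & 0 \\ 0 & 0.7 - 0.7i \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\]

and so, 

\[
V_n = PD^nP^{-1}V_0
\]
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} (0.7 + 0.7i)^n & 0 \\ 0 & (0.7 - 0.7i)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]

The explicit solution is given by 

\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix}
x_0\left(\frac{1}{2}(0.7 - 0.7i)^n + \frac{1}{2}(0.7 + 0.7i)^n\right) + y_0\left(\frac{1}{2}i(0.7 - 0.7i)^n - \frac{1}{2}i(0.7 + 0.7i)^n\right) \\
y_0\left(\frac{1}{2}(0.7 - 0.7i)^n + \frac{1}{2}(0.7 + 0.7i)^n\right) - x_0\left(\frac{1}{2}i(0.7 - 0.7i)^n - \frac{1}{2}i(0.7 + 0.7i)^n\right)
\end{bmatrix}
\]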

Suppose the initial condition is 

\[
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 10 \\ 10 \end{bmatrix}
\]

Then one obtains the following sequence of values, which are graphed below by letting 
$n = 1, 2, \cdots, 20$. 

In this picture, the dots are the values and the dashed line is to help to picture what is 
happening. 

These points are getting gradually closer to the origin, but they are circling the origin in 
the clockwise direction as they do so. As $n$ increases, the vector 
$\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}$ approaches $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. 
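
The spiraling behavior can be checked numerically. The sketch below, assuming NumPy and an initial condition of $(10, 10)$, iterates the system $V_{n+1} = AV_n$ with $A = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix}$ and records the distance of each point from the origin. Each step multiplies that distance by $|0.7 + 0.7i| = 0.7\sqrt{2} \approx 0.99$, which is why the points approach the origin slowly.

```python
import numpy as np

A = np.array([[0.7, 0.7],
              [-0.7, 0.7]])
v = np.array([10.0, 10.0])        # assumed initial condition (x_0, y_0)

points = [v]
for n in range(20):               # compute V_1, ..., V_20
    v = A @ v
    points.append(v)

# Distance of each point from the origin.
norms = [np.linalg.norm(p) for p in points]
shrink_factor = norms[1] / norms[0]
print(points[1], shrink_factor)
```

The first step moves $(10, 10)$ to $(14, 0)$, a turn of 45 degrees in the clockwise direction.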


This type of behavior along with complex eigenvalues is typical of the deviations from an 
equilibrium point in the Lotka Volterra system of differential equations which is a famous 
model for predator-prey interactions. These differential equations are given by 

\[
\begin{aligned}
X' &= X(a - bY) \\
Y' &= -Y(c - dX)
\end{aligned}
\]

where a, b, c, d are positive constants. For example, you might have X be the population of 
moose and Y the population of wolves on an island. 

Note that these equations make logical sense. The top says that the rate at which 
the moose population increases would be $aX$ if there were no predators $Y$. However, this is 
modified by multiplying instead by $(a - bY)$ because if there are predators, these will militate 
against the population of moose. The more predators there are, the more pronounced is this 
effect. As to the predator equation, you can see that the equations predict that if there 
are many prey around, then the rate of growth of the predators would seem to be high. 
However, this is modified by the term $-cY$ because if there are many predators, there would 
be competition for the available food supply and this would tend to decrease $Y'$. 

The behavior near an equilibrium point, which is a point where the right side of the 
differential equations equals zero, is of great interest. In this case, the equilibrium point is 


\[
X = \frac{c}{d}, \quad Y = \frac{a}{b}
\]

Then one defines new variables according to the formula 

\[
x + \frac{c}{d} = X, \quad Y = y + \frac{a}{b}
\]
In terms of these new variables, the differential equations become 

\[
\begin{aligned}
x' &= \left(x + \frac{c}{d}\right)\left(a - b\left(y + \frac{a}{b}\right)\right) \\
y' &= -\left(y + \frac{a}{b}\right)\left(c - d\left(x + \frac{c}{d}\right)\right)
\end{aligned}
\]

Multiplying out the right sides yields 

\[
\begin{aligned}
x' &= -bxy - \frac{bc}{d}y \\
y' &= dxy + \frac{ad}{b}x
\end{aligned}
\]

The interest is for $x, y$ small and so these equations are essentially equal to 

\[
x' = -\frac{bc}{d}y, \quad y' = \frac{ad}{b}x
\]

Replace $x'$ with the difference quotient $\frac{x(t+h) - x(t)}{h}$ where $h$ is a small positive number 
and $y'$ with a similar difference quotient. For example one could have $h$ correspond to one 
day or even one hour. Thus, for $h$ small enough, the following would seem to be a good 
approximation to the differential equations. 

\[
\begin{aligned}
x(t+h) &= x(t) - h\frac{bc}{d}y \\
y(t+h) &= y(t) + h\frac{ad}{b}x
\end{aligned}
\]

Let $1, 2, 3, \cdots$ denote the ends of discrete intervals of time having length $h$ chosen above. 
Then the above equations take the form 

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
Note that the eigenvalues of this matrix are always complex. 
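
To see why, note that the matrix of the discretized system is $\begin{bmatrix} 1 & -hbc/d \\ had/b & 1 \end{bmatrix}$, whose characteristic equation $(1 - \lambda)^2 + h^2ac = 0$ has roots $\lambda = 1 \pm ih\sqrt{ac}$. The short check below is only a sketch, assuming NumPy; the values of $a, b, c, d$ and $h$ are hypothetical and were chosen for illustration.

```python
import numpy as np

# Hypothetical positive constants and a small time step, for illustration only.
a, b, c, d = 1.0, 0.5, 0.75, 0.25
h = 0.01

# Matrix of the discretized system near the equilibrium point.
M = np.array([[1.0,           -h * b * c / d],
              [h * a * d / b,  1.0]])

eigenvalues = np.linalg.eigvals(M)
expected_imag = h * np.sqrt(a * c)   # eigenvalues should be 1 +/- i*h*sqrt(ac)
print(eigenvalues, expected_imag)
```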

We are not interested in time intervals of length h for h very small. Instead, we are 
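interested in much longer lengths of time. Thus, replacing the time interval with $mh$, 

\[
\begin{bmatrix} x(n+m) \\ y(n+m) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}^m
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

For example, if $m = 2$, you would have 

\[
\begin{bmatrix} x(n+2) \\ y(n+2) \end{bmatrix}
=
\begin{bmatrix} 1 - ach^2 & -2\frac{bc}{d}h \\ 2\frac{ad}{b}h & 1 - ach^2 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]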

Note that most of the time, the eigenvalues of the new matrix will be complex. 
You can also notice that the upper right corner will be negative by considering higher 
powers of the matrix. Thus letting $1, 2, 3, \cdots$ denote the ends of discrete intervals of time, 
the desired discrete dynamical system is of the form 

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} a & -b \\ c & d \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

where $a, b, c, d$ are positive constants and the matrix will likely have complex eigenvalues 
because it is a power of a matrix which has complex eigenvalues. 

You can see from the above discussion that if the eigenvalues of the matrix used to define 
the dynamical system are less than 1 in absolute value, then the origin is stable in the sense 
that as $n \to \infty$, the solution converges to the origin. If either eigenvalue is larger than 1 in 
absolute value, then the solutions to the dynamical system will usually be unbounded, unless 
the initial condition is chosen very carefully. The next example exhibits the case where one 
eigenvalue is larger than 1 and the other is smaller than 1. 

The following example demonstrates a familiar concept as a dynamical system. 



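Solution. This sequence is extremely important in the study of reproducing rabbits. It can 
be considered as a dynamical system as follows. Let $y(n) = x(n+1)$. Then the above 
recurrence relation can be written as 

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix},
\quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

Let 

\[
A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\]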

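The eigenvalues of the matrix $A$ are $\lambda_1 = \frac{1}{2} - \frac{1}{2}\sqrt{5}$ and $\lambda_2 = \frac{1}{2}\sqrt{5} + \frac{1}{2}$. The corresponding 
eigenvectors are, respectively, 

\[
X_1 = \begin{bmatrix} -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix},
\quad
X_2 = \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix}
\]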

You can see from a short computation that one of the eigenvalues is smaller than 1 in 
absolute value while the other is larger than 1 in absolute value. Now, diagonalizing A gives 
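\[
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \frac{1}{2}\sqrt{5} + \frac{1}{2} & 0 \\ 0 & \frac{1}{2} - \frac{1}{2}\sqrt{5} \end{bmatrix}
\begin{bmatrix} \frac{1}{5}\sqrt{5} & \frac{1}{10}\sqrt{5} + \frac{1}{2} \\ -\frac{1}{5}\sqrt{5} & \frac{1}{2} - \frac{1}{10}\sqrt{5} \end{bmatrix}
\]

Then it follows that for a given initial condition, the solution to this dynamical system 
is of the form 

\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \left(\frac{1}{2}\sqrt{5} + \frac{1}{2}\right)^n & 0 \\ 0 & \left(\frac{1}{2} - \frac{1}{2}\sqrt{5}\right)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{5}\sqrt{5} & \frac{1}{10}\sqrt{5} + \frac{1}{2} \\ -\frac{1}{5}\sqrt{5} & \frac{1}{2} - \frac{1}{10}\sqrt{5} \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

It follows that 

\[
x(n) = \left(\frac{1}{2}\sqrt{5} + \frac{1}{2}\right)^n\left(\frac{1}{10}\sqrt{5} + \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{2}\sqrt{5}\right)^n\left(\frac{1}{2} - \frac{1}{10}\sqrt{5}\right)
\]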
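
As a numerical check, the sketch below (assuming NumPy; the function names are hypothetical) computes $x(n)$ in two ways: by repeatedly applying $A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$ to the initial vector $(1, 1)$, and from the closed form $x(n) = \left(\frac{1}{2} + \frac{\sqrt{5}}{10}\right)\lambda_2^n + \left(\frac{1}{2} - \frac{\sqrt{5}}{10}\right)\lambda_1^n$ with $\lambda_{1,2} = \frac{1 \mp \sqrt{5}}{2}$, which is what the diagonalization yields for this initial condition.

```python
import numpy as np

A = np.array([[0, 1],
              [1, 1]])

def fib_by_iteration(n):
    """x(n) obtained by applying A to (x(0), y(0)) = (1, 1) n times."""
    v = np.array([1, 1])
    for _ in range(n):
        v = A @ v
    return int(v[0])

def fib_closed_form(n):
    """x(n) from the eigenvalue formula; exact only up to rounding."""
    s5 = np.sqrt(5.0)
    lam_large = (1 + s5) / 2      # eigenvalue larger than 1 in absolute value
    lam_small = (1 - s5) / 2      # eigenvalue smaller than 1 in absolute value
    return (0.5 + s5 / 10) * lam_large ** n + (0.5 - s5 / 10) * lam_small ** n

values = [fib_by_iteration(n) for n in range(8)]
print(values)
```

Because $|\lambda_1| < 1$, the second term of the closed form shrinks toward zero, so for large $n$ the sequence grows like a constant times $\lambda_2^n$.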



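Here is a picture of the ordered pairs $(x(n), y(n))$ for $n = 0, 1, \cdots, n$. 

[Figure: plot of the ordered pairs $(x(n), y(n))$; the horizontal axis runs from 0 to 30 and the vertical axis from 0 to 40.] 

□ 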


There is so much more that can be said about dynamical systems. It is a major topic of 
study in differential equations and what is given above is just an introduction. 


7.3.5. Exercises 


1. Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Diagonalize $A$ to find $A^{10}$. 

2. Let $A = \begin{bmatrix} 1 & 4 & 1 \\ 0 & 2 & 5 \\ 0 & 0 & 5 \end{bmatrix}$. Diagonalize $A$ to find $A^{50}$. 

3. Let $A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -1 & 1 \\ -2 & 3 & 1 \end{bmatrix}$. Diagonalize $A$ to find $A^{100}$. 


4. The following is a Markov (migration) matrix for three locations 


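\[
\begin{bmatrix}
\frac{7}{10} & \frac{1}{9} & \frac{1}{5} \\
\frac{1}{10} & \frac{7}{9} & \frac{2}{5} \\
\frac{1}{5} & \frac{1}{9} & \frac{2}{5}
\end{bmatrix}
\]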

(a) Initially, there are 90 people in location 1, 81 in location 2, and 85 in location 3. 
How many are in each location after one time period? 

(b) The total number of individuals in the migration process is 256. After a long 
time, how many are in each location? 

5. The following is a Markov (migration) matrix for three locations 

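\[
\begin{bmatrix}
\frac{1}{5} & \frac{1}{5} & \frac{2}{5} \\
\frac{2}{5} & \frac{2}{5} & \frac{1}{5} \\
\frac{2}{5} & \frac{2}{5} & \frac{2}{5}
\end{bmatrix}
\]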

(a) Initially, there are 130 individuals in location 1, 300 in location 2, and 70 in 
location 3. How many are in each location after two time periods? 

(b) The total number of individuals in the migration process is 500. After a long time, 
how many are in each location? 

6. The following is a Markov (migration) matrix for three locations 

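\[
\begin{bmatrix}
\frac{3}{10} & \frac{3}{8} & \frac{1}{3} \\
\frac{1}{10} & \frac{3}{8} & \frac{1}{3} \\
\frac{3}{5} & \frac{1}{4} & \frac{1}{3}
\end{bmatrix}
\]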

The total number of individuals in the migration process is 480. After a long time, 
how many are in each location? 
7. The following is a Markov (migration) matrix for three locations 

\[
\begin{bmatrix}
\frac{3}{10} & \frac{1}{3} & \frac{1}{5} \\
\frac{3}{10} & \frac{1}{3} & \frac{7}{10} \\
\frac{2}{5} & \frac{1}{3} & \frac{1}{10}
\end{bmatrix}
\]

The total number of individuals in the migration process is 1155. After a long time, 
how many are in each location? 

8. The following is a Markov (migration) matrix for three locations 

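\[
\begin{bmatrix}
\frac{2}{5} & \frac{1}{10} & \frac{1}{8} \\
\frac{3}{10} & \frac{2}{5} & \frac{5}{8} \\
\frac{3}{10} & \frac{1}{2} & \frac{1}{4}
\end{bmatrix}
\]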

The total number of individuals in the migration process is 704. After a long time, how 
many are in each location? 

9. You own a trailer rental company in a large city and you have four locations, one in 
the South East, one in the North East, one in the North West, and one in the South 
West. Denote these locations by SE,NE,NW, and SW respectively. Suppose that the 
following table is observed to take place. 



        SE      NE      NW      SW
SE      1/3     1/10    1/10    1/5
NE      1/3     7/10    1/5     1/10
NW      2/9     1/10    3/5     1/5
SW      1/9     1/10    1/10    1/2


In this table, the probability that a trailer starting at NE ends in NW is 1/10, the 
probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately 
how many will you have in each location after a long time if the total number 
of trailers is 413? 

10. You own a trailer rental company in a large city and you have four locations, one in 
the South East, one in the North East, one in the North West, and one in the South 
West. Denote these locations by SE,NE,NW, and SW respectively. Suppose that the 
following table is observed to take place. 

        SE      NE      NW      SW
SE      1/7     1/4     1/10    1/5
NE      2/7     1/4     1/5     1/10
NW      1/7     1/4     3/5     1/5
SW      3/7     1/4     1/10    1/2


In this table, the probability that a trailer starting at NE ends in NW is 1/10, the 
probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately 
how many will you have in each location after a long time if the total number 
of trailers is 1469? 

11. The following table describes the transition probabilities between the states rainy, 
partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off 
p.c. it ends up sunny the next day with probability |. If it starts off sunny, it ends up 
sunny the next day with probability | and so forth. 

rains sunny p.c. 
rains \ f \ 

sunny \ § \ 

pc - - - 

5 5 3 

Given this information, what are the probabilities that a given day is rainy, sunny, or 
partly cloudy? 

12. The following table describes the transition probabilities between the states rainy, 
partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off 
p.c. it ends up sunny the next day with probability A. If it starts off sunny, it ends 
up sunny the next day with probability | and so forth. 



        rains   sunny   p.c.
rains   1/5     1/5     1/3
sunny   1/10    2/5     4/9
p.c.    7/10    2/5     2/9


Given this information, what are the probabilities that a given day is rainy, sunny, or 
partly cloudy? 
13. You own a trailer rental company in a large city and you have four locations, one in 
the South East, one in the North East, one in the North West, and one in the South 
West. Denote these locations by SE, NE, NW, and SW respectively. Suppose that the 
following table is observed to take place. 

        SE      NE      NW      SW
SE      5/11    1/10    1/10    1/5
NE      1/11    7/10    1/5     1/10
NW      2/11    1/10    3/5     1/5
SW      3/11    1/10    1/10    1/2


In this table, the probability that a trailer starting at NE ends in NW is 1/10, the 
probability that a trailer starting at SW ends in NW is 1/5, and so forth. Approximately 
how many will you have in each location after a long time if the total number 
of trailers is 407? 


14. The University of Poohbah offers three degree programs, scouting education (SE), 
dance appreciation (DA), and engineering (E). It has been determined that the 
probabilities of transferring from one program to another are as in the following table. 

        SE      DA      E
SE      .8      .1      .3
DA      .1      .7      .5
E       .1      .2      .2


where the number indicates the probability of transferring from the top program to 
the program on the left. Thus the probability of going from DA to E is .2. Find the 
probability that a student is enrolled in the various programs. 


15. In the city of Nabal, there are three political persuasions, republicans (R), democrats 
(D), and neither one (N). The following table shows the transition probabilities between 
the political parties, the top row being the initial political party and the side row being 
the political affiliation the following year. 


        R       D       N
R       1/5     1/6     2/7
D       1/5     1/3     4/7
N       3/5     1/2     1/7


Find the probabilities that a person will be identified with the various political 
persuasions. Which party will end up being most important? 
16. The following table describes the transition probabilities between the states rainy, 
partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off 
p.c. it ends up sunny the next day with probability |. If it starts off sunny, it ends up 
sunny the next day with probability | and so forth. 

        rains   sunny   p.c.
rains   1/5     2/7     5/9
sunny   1/5     2/7     1/3
p.c.    3/5     3/7     1/9

Given this information, what are the probabilities that a given day is rainy, sunny, or 
partly cloudy? 


7.4 Orthogonality 


7.4.1. Orthogonal Diagonalization 


We begin this section by recalling some important definitions. Recall from Definition 4.93 
that non-zero vectors are called orthogonal if their dot product equals 0. A set is orthonormal 
if it is orthogonal and each vector is a unit vector. 

An orthogonal matrix $U$, from Definition 4.97, is one in which $UU^T = I$. In other 
words, the transpose of an orthogonal matrix is equal to its inverse. A key characteristic 
of orthogonal matrices, which will be essential in this section, is that the columns of an 
orthogonal matrix form an orthonormal set. 

We now recall another important definition. 



Before proving an essential theorem, we first examine the following lemma which will be 
used below. 
Proof. This result follows from the definition of the dot product together with properties of 
matrix multiplication, as follows: 

\[
\begin{aligned}
Ax \cdot y &= \sum_{k,l} a_{kl} x_l y_k \\
&= \sum_{k,l} (a_{lk})^T x_l y_k \\
&= x \cdot A^T y \\
&= x \cdot Ay
\end{aligned}
\]

The last step follows from $A^T = A$, since $A$ is symmetric. □ 


We can now prove that the eigenvalues of a real symmetric matrix are real numbers. 
Consider the following important theorem. 


Theorem 7.39: Orthogonal Eigenvectors 


Let A be a real symmetric matrix. Then the eigenvalues of A are real numbers and 
eigenvectors corresponding to distinct eigenvalues are orthogonal. 


Proof. Recall that for a complex number $a + ib$, the complex conjugate, denoted by $\overline{a + ib}$, 
is given by $\overline{a + ib} = a - ib$. The notation $\overline{x}$ will denote the vector which has every entry 
replaced by its complex conjugate. 

Suppose $A$ is a real symmetric matrix and $Ax = \lambda x$. Then 

\[
\overline{\lambda}\,\overline{x}^T x = \left(\overline{Ax}\right)^T x = \overline{x}^T A^T x = \overline{x}^T A x = \lambda \overline{x}^T x
\]

Dividing by $\overline{x}^T x$ on both sides yields $\overline{\lambda} = \lambda$ which says $\lambda$ is real. To do this, we need to 
ensure that $\overline{x}^T x \neq 0$. Notice that $\overline{x}^T x = 0$ if and only if $x = 0$. Since we chose $x$ such that 
$Ax = \lambda x$, $x$ is an eigenvector and therefore must be nonzero. 

Now suppose $A$ is real symmetric and $Ax = \lambda x$, $Ay = \mu y$ where $\mu \neq \lambda$. Then since $A$ is 
symmetric, it follows from Lemma 7.38 about the dot product that 

\[
\lambda x \cdot y = Ax \cdot y = x \cdot Ay = x \cdot \mu y = \mu x \cdot y
\]

Hence $(\lambda - \mu)\, x \cdot y = 0$. It follows that, since $\lambda - \mu \neq 0$, it must be that $x \cdot y = 0$. Therefore 
the eigenvectors form an orthogonal set. □ 
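
The conclusion of this theorem can be observed numerically. The following sketch, assuming NumPy and using the symmetric sample matrix $\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$, computes the eigenvalues and eigenvectors and checks that the eigenvalues are real and that the two eigenvectors have dot product zero up to rounding.

```python
import numpy as np

# A real symmetric sample matrix with distinct eigenvalues 2 - sqrt(5) and 2 + sqrt(5).
A = np.array([[1.0, 2.0],
              [2.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of eigenvectors are eigenvectors
x, y = eigenvectors[:, 0], eigenvectors[:, 1]

dot = float(x @ y)                             # should vanish up to rounding error
print(eigenvalues, dot)
```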


The following theorem is proved in a similar manner. 



Proof. First, note that if A = 0 is the zero matrix, then A is skew symmetric and has 
eigenvalues equal to 0. 
Suppose $A = -A^T$ so $A$ is skew symmetric and $Ax = \lambda x$. Then 

\[
\overline{\lambda}\,\overline{x}^T x = \left(\overline{Ax}\right)^T x = \overline{x}^T A^T x = -\overline{x}^T A x = -\lambda \overline{x}^T x
\]

and so, dividing by $\overline{x}^T x$ as before, $\overline{\lambda} = -\lambda$. Letting $\lambda = a + ib$, this means $a - ib = -a - ib$ 
and so $a = 0$. Thus $\lambda$ is pure imaginary. □ 


Consider the following example. 


Example 7.41: Eigenvalues of a Skew Symmetric Matrix 

Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. Find its eigenvalues. 


Solution. First notice that $A$ is skew symmetric. By Theorem 7.40, the eigenvalues will 
either equal 0 or be pure imaginary. The eigenvalues of $A$ are obtained by solving the usual 
equation 

\[
\det(xI - A) = \det \begin{bmatrix} x & 1 \\ -1 & x \end{bmatrix} = x^2 + 1 = 0
\]

Hence the eigenvalues are $\pm i$, pure imaginary. 


□ 


Consider the following example. 


Example 7.42: Eigenvalues of a Symmetric Matrix 

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$. Find its eigenvalues. 


Solution. First, notice that $A$ is symmetric. By Theorem 7.39, the eigenvalues will all be 
real. The eigenvalues of $A$ are obtained by solving the usual equation 

\[
\det(xI - A) = \det \begin{bmatrix} x - 1 & -2 \\ -2 & x - 3 \end{bmatrix} = x^2 - 4x - 1 = 0
\]

The eigenvalues are given by $\lambda_1 = 2 + \sqrt{5}$ and $\lambda_2 = 2 - \sqrt{5}$ which are both real. 


□ 


Recall that a diagonal matrix $D = [d_{ij}]$ is one in which $d_{ij} = 0$ whenever $i \neq j$. In other 
words, all numbers not on the main diagonal are equal to zero. 

Consider the following important theorem. 
We can use this theorem to diagonalize a symmetric matrix, using orthogonal matrices. 
Consider the following corollary. 



Proof. Since $A$ is symmetric, then by Theorem 7.43, there exists an orthogonal matrix $U$ 
such that $U^TAU = D$, a diagonal matrix whose diagonal entries are the eigenvalues of $A$. 
Therefore, since $A$ is symmetric and all the matrices are real, 

\[
\overline{D} = \overline{D^T} = \overline{U^TA^TU} = U^TA^TU = U^TAU = D
\]

showing $D$ is real because each entry of $D$ equals its complex conjugate. 
Now let 

\[
U = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}
\]

where the $u_i$ denote the columns of $U$ and 

\[
D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

The equation $U^TAU = D$ implies $AU = UD$ and 

\[
\begin{aligned}
AU &= \begin{bmatrix} Au_1 & Au_2 & \cdots & Au_n \end{bmatrix} \\
&= \begin{bmatrix} \lambda_1 u_1 & \lambda_2 u_2 & \cdots & \lambda_n u_n \end{bmatrix} \\
&= UD
\end{aligned}
\]

where the entries denote the columns of $AU$ and $UD$ respectively. Therefore, $Au_i = \lambda_i u_i$. 
Since the matrix $U$ is orthogonal, the $ij$th entry of $U^TU$ equals $\delta_{ij}$ and so 

\[
\delta_{ij} = u_i^T u_j = u_i \cdot u_j
\]

This proves the corollary because it shows the vectors $\{u_i\}$ form an orthonormal set. □ 


Example 7.45: Find an Orthonormal Set of Eigenvectors 

Find an orthonormal set of eigenvectors for the symmetric matrix 

\[
A = \begin{bmatrix} 17 & -2 & -2 \\ -2 & 6 & 4 \\ -2 & 4 & 6 \end{bmatrix}
\]
Solution. Recall Procedure 7.5 for finding the eigenvalues and eigenvectors of a matrix. You 
can verify that the eigenvalues are 18, 9, 2. First find the eigenvector for 18 by solving the 
equation $(18I - A)X = 0$. The appropriate augmented matrix is given by 

\[
\left[\begin{array}{ccc|c}
18 - 17 & 2 & 2 & 0 \\
2 & 18 - 6 & -4 & 0 \\
2 & -4 & 18 - 6 & 0
\end{array}\right]
\]

The reduced row-echelon form is 

\[
\left[\begin{array}{ccc|c}
1 & 0 & 4 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore an eigenvector is 

\[
\begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix}
\]

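Next find the eigenvector for $\lambda = 9$. The augmented matrix and resulting reduced row-echelon 
form are 

\[
\left[\begin{array}{ccc|c}
9 - 17 & 2 & 2 & 0 \\
2 & 9 - 6 & -4 & 0 \\
2 & -4 & 9 - 6 & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
1 & 0 & -\frac{1}{2} & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Thus an eigenvector for $\lambda = 9$ is 

\[
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\]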

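Finally find an eigenvector for $\lambda = 2$. The appropriate augmented matrix and reduced 
row-echelon form are 

\[
\left[\begin{array}{ccc|c}
2 - 17 & 2 & 2 & 0 \\
2 & 2 - 6 & -4 & 0 \\
2 & -4 & 2 - 6 & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Thus an eigenvector for $\lambda = 2$ is 

\[
\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}
\]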

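The set of eigenvectors for $A$ is given by 

\[
\left\{ \begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \right\}
\]

You can verify that these eigenvectors form an orthogonal set. By dividing each eigenvector 
by its magnitude, we obtain an orthonormal set: 

\[
\left\{ \frac{1}{\sqrt{18}} \begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix}, \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \right\}
\]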
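
The result of this example can be verified with a short computation. The sketch below, assuming NumPy, checks that each vector found by row reduction satisfies $AX = \lambda X$ for its eigenvalue, and that the normalized vectors, placed as the columns of a matrix $U$ (a name introduced here for illustration), satisfy $U^TU = I$.

```python
import numpy as np

A = np.array([[17.0, -2.0, -2.0],
              [-2.0,  6.0,  4.0],
              [-2.0,  4.0,  6.0]])

# Each eigenvalue paired with the eigenvector found by row reduction.
pairs = [(18.0, np.array([-4.0, 1.0, 1.0])),
         (9.0,  np.array([1.0, 2.0, 2.0])),
         (2.0,  np.array([0.0, -1.0, 1.0]))]

# Divide each eigenvector by its length and use the results as columns.
U = np.column_stack([v / np.linalg.norm(v) for _, v in pairs])

gram = U.T @ U    # equals the identity exactly when the columns are orthonormal
print(np.round(gram, 12))
```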
Consider the following example. 


Example 7.46: Repeated Eigenvalues 

Find an orthonormal set of three eigenvectors for the matrix 

\[
A = \begin{bmatrix} 10 & 2 & 2 \\ 2 & 13 & 4 \\ 2 & 4 & 13 \end{bmatrix}
\]

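Solution. You can verify that the eigenvalues of $A$ are 9 (with multiplicity two) and 18 
(with multiplicity one). Consider the eigenvectors corresponding to $\lambda = 9$. The appropriate 
augmented matrix and reduced row-echelon form are given by 

\[
\left[\begin{array}{ccc|c}
9 - 10 & -2 & -2 & 0 \\
-2 & 9 - 13 & -4 & 0 \\
-2 & -4 & 9 - 13 & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
1 & 2 & 2 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so eigenvectors are of the form 

\[
\begin{bmatrix} -2y - 2z \\ y \\ z \end{bmatrix}
\]

We need to find two of these which are orthogonal. Let one be given by setting $z = 0$ and 
$y = 1$, giving 

\[
\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}
\]

In order to find an eigenvector orthogonal to this one, we need to satisfy 

\[
\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} -2y - 2z \\ y \\ z \end{bmatrix} = 5y + 4z = 0
\]

The values $y = -4$ and $z = 5$ satisfy this equation, giving another eigenvector corresponding 
to $\lambda = 9$ as 

\[
\begin{bmatrix} -2(-4) - 2(5) \\ (-4) \\ 5 \end{bmatrix} = \begin{bmatrix} -2 \\ -4 \\ 5 \end{bmatrix}
\]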

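Next find the eigenvector for $\lambda = 18$. The augmented matrix and the resulting reduced 
row-echelon form are given by 

\[
\left[\begin{array}{ccc|c}
18 - 10 & -2 & -2 & 0 \\
-2 & 18 - 13 & -4 & 0 \\
-2 & -4 & 18 - 13 & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
1 & 0 & -\frac{1}{2} & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is 

\[
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\]

Dividing each eigenvector by its length, the orthonormal set is 

\[
\left\{ \frac{1}{\sqrt{5}} \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{3\sqrt{5}} \begin{bmatrix} -2 \\ -4 \\ 5 \end{bmatrix}, \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \right\}
\]

□ 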

In the above solution, the repeated eigenvalue implies that there would have been many 
other orthonormal bases which could have been obtained. While we chose to take $z = 0$, $y = 1$, 
we could just as easily have taken $y = 0$ or even $y = z = 1$. Any such change would have 
resulted in a different orthonormal set. 

Recall the following definition. 


Definition 7.47: Diagonalizable 


An $n \times n$ matrix $A$ is said to be non defective or diagonalizable if there exists an 
invertible matrix $P$ such that $P^{-1}AP = D$ where $D$ is a diagonal matrix. 


As indicated in Theorem 7.43 if $A$ is a real symmetric matrix, there exists an orthogonal 
matrix $U$ such that $U^TAU = D$ where $D$ is a diagonal matrix. Therefore, every symmetric 
matrix is diagonalizable because if $U$ is an orthogonal matrix, it is invertible and its inverse is 
$U^T$. In this case, we say that $A$ is orthogonally diagonalizable. In the following example, 
this orthogonal matrix $U$ will be found. 


Example 7.48: Diagonalize a Symmetric Matrix 

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Find an orthogonal matrix $U$ such that $U^TAU$ is a diagonal 
matrix. 


Solution. In this case, the eigenvalues are 2 (with multiplicity one) and 1 (with multiplicity 
two). First we will find an eigenvector for the eigenvalue 2. The appropriate augmented 
matrix and resulting reduced row-echelon form are given by 

\[
\left[\begin{array}{ccc|c}
2 - 1 & 0 & 0 & 0 \\
0 & 2 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 2 - \frac{3}{2} & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is 

\[
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]

However, it is desired that the eigenvectors be unit vectors and so dividing this vector by its 
length gives 

\[
\begin{bmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}
\]

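Next find the eigenvectors corresponding to the eigenvalue equal to 1. The appropriate 
augmented matrix and resulting reduced row-echelon form are given by: 

\[
\left[\begin{array}{ccc|c}
1 - 1 & 0 & 0 & 0 \\
0 & 1 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 1 - \frac{3}{2} & 0
\end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c}
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore, the eigenvectors are of the form 

\[
\begin{bmatrix} s \\ -t \\ t \end{bmatrix}
\]

Two of these which are orthonormal are $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, choosing $s = 1$ and $t = 0$, and 
$\begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}$, letting $s = 0$, $t = 1$ and normalizing the resulting vector. 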

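To obtain the desired orthogonal matrix, we let the orthonormal eigenvectors computed 
above be the columns. 

\[
U = \begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}
\]

To verify, compute $U^TAU$ as follows: 

\[
U^TAU =
\begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
= D
\]

the desired diagonal matrix. Notice that the eigenvectors, which construct the columns of 
$U$, are in the same order as the eigenvalues in $D$. □ 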
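
An orthogonal diagonalization can also be obtained numerically. The sketch below assumes NumPy and uses its routine `numpy.linalg.eigh`, which is intended for symmetric matrices and returns the eigenvalues in ascending order together with orthonormal eigenvectors as columns. The matrix $U$ it produces may differ from one found by hand (for example in the signs of its columns or in the basis chosen for a repeated eigenvalue), but it still satisfies $U^TAU = D$.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.5, 0.5],
              [0.0, 0.5, 1.5]])

# eigh returns ascending eigenvalues and an orthogonal matrix of eigenvectors.
eigenvalues, U = np.linalg.eigh(A)

D = U.T @ A @ U     # should be diagonal, with the eigenvalues on the diagonal
print(np.round(D, 12))
```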
7.4.2. Positive Definite Matrices 


7.4.3. QR Factorization 


In this section, a reliable factorization of matrices is studied. Called the QR factorization of 
a matrix, it always exists. While much can be said about the QR factorization, this section 
will be limited to real matrices. Therefore we assume the dot product used below is the 
usual dot product. We begin with a definition. 


Definition 7.49: QR Factorization 


Let A be a real m x n matrix. Then a QR factorization of A consists of two matrices, 
Q orthogonal and R upper triangular, such that A = QR. 


The following theorem claims that such a factorization exists. 


Theorem 7.50: Existence of QR Factorization 


Let $A$ be any real $m \times n$ matrix with linearly independent columns. Then there exists 
an orthogonal matrix Q and an upper triangular matrix R having non-negative entries 
on the main diagonal such that 

A = QR 


The procedure for obtaining the QR factorization for any matrix A is as follows. 
Procedure 7.51: QR Factorization 


Let $A$ be an $m \times n$ matrix given by $A = \begin{bmatrix} A_1 & A_2 & \cdots & A_n \end{bmatrix}$ where the $A_i$ are the 
linearly independent columns of $A$. 


1. Apply the Gram-Schmidt Process 4.102 to the columns of $A$, writing $B_i$ for the 
resulting columns. 


2. Normalize the $B_i$, to find $C_i = \frac{1}{\|B_i\|}B_i$. 

3. Construct the orthogonal matrix $Q$ as $Q = \begin{bmatrix} C_1 & C_2 & \cdots & C_n \end{bmatrix}$. 

4. Construct the upper triangular matrix $R$ as 

\[
R = \begin{bmatrix}
\|B_1\| & A_2 \cdot C_1 & A_3 \cdot C_1 & \cdots & A_n \cdot C_1 \\
0 & \|B_2\| & A_3 \cdot C_2 & \cdots & A_n \cdot C_2 \\
0 & 0 & \|B_3\| & \cdots & A_n \cdot C_3 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & \|B_n\|
\end{bmatrix}
\]

5. Finally, write $A = QR$ where $Q$ is the orthogonal matrix and $R$ is the upper 
triangular matrix obtained above. 


Notice that $Q$ is an orthogonal matrix as the $C_i$ form an orthonormal set. Since $\|B_i\| > 0$ 
for all $i$ (since the length of a vector is always positive), it follows that $R$ is an upper triangular 
matrix with positive entries on the main diagonal. 
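
The procedure above translates directly into a short program. The following is a minimal sketch, assuming NumPy and a matrix with linearly independent columns; the function name `qr_gram_schmidt` is hypothetical, and the sample matrix has columns $(1, 0, 1)$ and $(2, 1, 0)$.

```python
import numpy as np

def qr_gram_schmidt(A):
    """QR factorization by the Gram-Schmidt process (columns must be independent)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        B = A[:, j].copy()
        for i in range(j):
            R[i, j] = A[:, j] @ Q[:, i]   # entry A_j . C_i of R
            B -= R[i, j] * Q[:, i]        # remove the component of A_j along C_i
        R[j, j] = np.linalg.norm(B)       # ||B_j||, positive for independent columns
        Q[:, j] = B / R[j, j]             # C_j = B_j / ||B_j||
    return Q, R

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q, R = qr_gram_schmidt(A)
print(Q @ R)   # reproduces A up to rounding
```

This classical form of Gram-Schmidt mirrors the procedure step by step; library routines such as `numpy.linalg.qr` use other methods and may return factors whose signs differ.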

Consider the following example. 



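Solution. First, observe that $A_1$, $A_2$, the columns of $A$, are linearly independent. Therefore 
we can use the Gram-Schmidt Process to create a corresponding orthogonal set $\{B_1, B_2\}$ as 
follows: 

\[
\begin{aligned}
B_1 &= A_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \\
B_2 &= A_2 - \frac{A_2 \cdot B_1}{\|B_1\|^2} B_1
= \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} - \frac{2}{2} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}
\end{aligned}
\]

Normalize each vector to create the set $\{C_1, C_2\}$ as follows: 

\[
\begin{aligned}
C_1 &= \frac{1}{\|B_1\|} B_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \\
C_2 &= \frac{1}{\|B_2\|} B_2 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}
\end{aligned}
\]

Now construct the orthogonal matrix $Q$ as 

\[
Q = \begin{bmatrix} C_1 & C_2 \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} \end{bmatrix}
\]

Finally, construct the upper triangular matrix $R$ as 

\[
R = \begin{bmatrix} \|B_1\| & A_2 \cdot C_1 \\ 0 & \|B_2\| \end{bmatrix}
= \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 0 & \sqrt{3} \end{bmatrix}
\]

It is left to the reader to verify that $A = QR$. 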


□ 


The QR Factorization and Eigenvalues 

The QR factorization of a matrix has a very useful application. It turns out that it can be 
used repeatedly to estimate the eigenvalues of a matrix. Consider the following procedure. 
Procedure 7.53: Using the QR Factorization to Estimate Eigenvalues 


Let $A$ be an invertible matrix. Define the matrices $A_1, A_2, \cdots$ as follows: 

1. $A_1 = A$ factored as $A_1 = Q_1R_1$ 

2. $A_2 = R_1Q_1$ factored as $A_2 = Q_2R_2$ 

3. $A_3 = R_2Q_2$ factored as $A_3 = Q_3R_3$ 

Continue in this manner, where in general $A_k = Q_kR_k$ and $A_{k+1} = R_kQ_k$. 
Then it follows that this sequence of $A_i$ converges to an upper triangular matrix which 
is similar to $A$. Therefore the eigenvalues of $A$ can be approximated by the entries on 
the main diagonal of this upper triangular matrix. 
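
The iteration in this procedure is short to write down. The sketch below assumes NumPy and uses `numpy.linalg.qr` for each factorization; the function name `qr_eigenvalue_estimates`, the iteration count, and the sample matrix are hypothetical choices for illustration. The sample matrix is invertible with real eigenvalues of distinct absolute value, a case in which the iteration settles down quickly.

```python
import numpy as np

def qr_eigenvalue_estimates(A, iterations=100):
    """Repeat A_k = Q_k R_k, A_{k+1} = R_k Q_k and return the diagonal of the result."""
    A_k = np.asarray(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(A_k)   # factor A_k = Q_k R_k
        A_k = R @ Q                # A_{k+1} = R_k Q_k, which is similar to A_k
    return np.diag(A_k)

# Hypothetical sample matrix with eigenvalues (5 - sqrt(5))/2 and (5 + sqrt(5))/2.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
estimates = qr_eigenvalue_estimates(A)
print(estimates)
```

Each $A_{k+1} = R_kQ_k = Q_k^{-1}A_kQ_k$ is similar to $A_k$, which is why the eigenvalues are preserved from one step to the next.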


7.4.4. Quadratic Forms 


One of the applications of orthogonal diagonalization is that of quadratic forms and graphs 
of level curves of a quadratic form. This section has to do with rotation of axes so that 
with respect to the new axes, the graph of the level curve of a quadratic form is oriented 
parallel to the coordinate axes. This makes it much easier to understand. For example, we 
all know that $x_1^2 + x_2^2 = 1$ represents the equation in two variables whose graph in $\mathbb{R}^2$ is a 
circle of radius 1. But how do we know what the graph of the equation $5x_1^2 + 4x_1x_2 + 3x_2^2 = 1$ 
represents? 

We first formally define what is meant by a quadratic form. In this section we will work 
with only real quadratic forms, which means that the coefficients will all be real numbers. 


Definition 7.54: Quadratic Form 


A quadratic form is a polynomial of degree two in $n$ variables $x_1, x_2, \cdots, x_n$, written 
as a linear combination of $x_i^2$ terms and $x_ix_j$ terms. 


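Consider the quadratic form $q = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + a_{12}x_1x_2 + \cdots$. We can write 

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]

as the vector whose entries are the variables contained in the quadratic form. 

Similarly, let 

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]

be the matrix whose entries are the coefficients of $x_i^2$ and $x_ix_j$ from $q$. Note that the 
matrix $A$ is not unique, and we will consider this further 
in the example below. Using this matrix $A$, the quadratic form can be written as $q = x^TAx$. 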
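\[
\begin{aligned}
q &= x^TAx \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n
\end{bmatrix} \\
&= a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + a_{12}x_1x_2 + \cdots
\end{aligned}
\]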


Let’s explore how to find this matrix A. Consider the following example. 


Example 7.55: Matrix of a Quadratic Form 


Let a quadratic form $q$ be given by

\[ q = 6x_1^2 + 4x_1x_2 + 3x_2^2 \]

Write $q$ in the form $\vec{x}^T A \vec{x}$.






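Solution. First, let $\vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$.

Then, writing $q = \vec{x}^T A \vec{x}$ gives

\[
\begin{aligned}
q &= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
     \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
     \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
  &= a_{11}x_1^2 + a_{21}x_1x_2 + a_{12}x_1x_2 + a_{22}x_2^2
\end{aligned}
\]

Notice that we have an $x_1x_2$ term as well as an $x_2x_1$ term. Since multiplication is
commutative, these terms can be combined. This means that $q$ can be written

\[ q = a_{11}x_1^2 + (a_{21} + a_{12})x_1x_2 + a_{22}x_2^2 \]

Equating this to $q$ as given in the example, we have

\[ a_{11}x_1^2 + (a_{21} + a_{12})x_1x_2 + a_{22}x_2^2 = 6x_1^2 + 4x_1x_2 + 3x_2^2 \]

Therefore,

\[
\begin{aligned}
a_{11} &= 6 \\
a_{22} &= 3 \\
a_{21} + a_{12} &= 4
\end{aligned}
\]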
This demonstrates that the matrix $A$ is not unique, as there are several correct solutions
to $a_{21} + a_{12} = 4$. However, we will always choose the coefficients such that
$a_{21} = a_{12} = \frac{1}{2}(a_{21} + a_{12})$. This results in $a_{21} = a_{12} = 2$. This choice is key, as it will ensure that $A$ turns
out to be a symmetric matrix.

Hence,

\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
  = \begin{bmatrix} 6 & 2 \\ 2 & 3 \end{bmatrix}
\]

You can verify that $q = \vec{x}^T A \vec{x}$ holds for this choice of $A$. □


The above procedure for choosing $A$ to be symmetric applies for any quadratic form $q$.
We will always choose coefficients such that $a_{ij} = a_{ji}$.
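The choice of a symmetric $A$ can be checked numerically. The sketch below, in Python, builds the symmetric matrix for a quadratic form in two variables by splitting the $x_1x_2$ coefficient evenly, and compares it with a non-symmetric matrix whose off-diagonal entries also add to 4. The helper names `symmetric_matrix_2var` and `quadratic_form` are illustrative only.

```python
def symmetric_matrix_2var(c11, c12, c22):
    """Symmetric A with c11*x1^2 + c12*x1*x2 + c22*x2^2 = x^T A x."""
    half = c12 / 2          # split the cross-term coefficient evenly
    return [[c11, half], [half, c22]]

def quadratic_form(A, x):
    """Evaluate x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

A = symmetric_matrix_2var(6, 4, 3)   # q = 6 x1^2 + 4 x1 x2 + 3 x2^2
B = [[6, 4], [0, 3]]                 # not symmetric, but a12 + a21 = 4 as well

values = []
for x in [(1, 0), (0, 1), (1, 1), (2, -3)]:
    direct = 6 * x[0] ** 2 + 4 * x[0] * x[1] + 3 * x[1] ** 2
    values.append((direct, quadratic_form(A, x), quadratic_form(B, x)))
print(values)
```

Both matrices reproduce $q$ at every sample point, which illustrates why $A$ is not unique; only the symmetric choice can be orthogonally diagonalized.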

We now turn our attention to the focus of this section. Our goal is to start with a
quadratic form $q$ as given above and find a way to rewrite it to eliminate the $x_i x_j$ terms.
This is done through a change of variables. In other words, we wish to find $y_i$ such that

\[ q = d_{11}y_1^2 + d_{22}y_2^2 + \cdots + d_{nn}y_n^2 \]

Letting $\vec{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ and $D = [d_{ij}]$, we can write $q = \vec{y}^T D \vec{y}$ where $D$ is the matrix of
coefficients from $q$. There is something special about this matrix $D$ that is crucial. Since no
$y_i y_j$ terms exist in $q$, it follows that $d_{ij} = 0$ for all $i \neq j$. Therefore, $D$ is a diagonal matrix.
Through this change of variables, we find the principal axes $y_1, y_2, \cdots, y_n$ of the quadratic
form.

This discussion sets the stage for the following essential theorem. 
While not a formal proof, the following discussion should convince you that the above
theorem holds. Let $q$ be a quadratic form in the variables $x_1, \cdots, x_n$. Then, $q$ can be written
in the form $q = \vec{x}^T A \vec{x}$ for a symmetric matrix $A$. By Theorem 7.43 we can orthogonally
diagonalize the matrix $A$ such that $U^T A U = D$ for an orthogonal matrix $U$ and diagonal
matrix $D$.

Then, the vector $\vec{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ is found by $\vec{y} = U^T \vec{x}$. To see that this works, rewrite
$\vec{y} = U^T \vec{x}$ as $\vec{x} = U\vec{y}$. Letting $q = \vec{x}^T A \vec{x}$, proceed as follows:

\[
\begin{aligned}
q &= \vec{x}^T A \vec{x} \\
  &= (U\vec{y})^T A (U\vec{y}) \\
  &= \vec{y}^T (U^T A U) \vec{y} \\
  &= \vec{y}^T D \vec{y}
\end{aligned}
\]

The following procedure details the steps for the change of variables given in the above 
theorem. 


Procedure 7.57: Diagonalizing a Quadratic Form 


Let $q$ be a quadratic form in the variables $x_1, \cdots, x_n$ given by

\[ q = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + a_{12}x_1x_2 + \cdots \]

Then, $q$ can be written as $q = d_{11}y_1^2 + \cdots + d_{nn}y_n^2$ as follows:

1. Write $q = \vec{x}^T A \vec{x}$ for a symmetric matrix $A$.

2. Orthogonally diagonalize $A$ to be written as $U^T A U = D$ for an orthogonal matrix
$U$ and diagonal matrix $D$.

3. Write $\vec{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$. Then, $\vec{x} = U\vec{y}$.

4. The quadratic form $q$ will now be given by

\[ q = d_{11}y_1^2 + \cdots + d_{nn}y_n^2 = \vec{y}^T D \vec{y} \]

where $D = [d_{ij}]$ is the diagonal matrix found by orthogonally diagonalizing $A$.
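For a quadratic form in two variables, the steps of this procedure can be carried out in closed form. The following Python sketch is one way to do it, assuming the symmetric matrix is $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$; the function names are hypothetical, and the ordering and signs of the eigenvectors are one valid choice among several.

```python
import math

def diagonalize_symmetric_2x2(a, b, c):
    """Orthogonally diagonalize A = [[a, b], [b, c]].

    Returns (U, (d1, d2)) where the columns of U are orthonormal
    eigenvectors and U^T A U = diag(d1, d2).
    """
    if b == 0:
        # A is already diagonal, so the identity matrix works as U.
        return [[1.0, 0.0], [0.0, 1.0]], (a, c)
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    d1, d2 = mean + radius, mean - radius      # roots of the characteristic polynomial
    norm = math.hypot(b, d1 - a)
    u1 = (b / norm, (d1 - a) / norm)           # unit eigenvector for d1
    u2 = (-u1[1], u1[0])                       # perpendicular unit vector, eigenvector for d2
    U = [[u1[0], u2[0]], [u1[1], u2[1]]]
    return U, (d1, d2)

def congruence(U, A):
    """Return U^T A U for 2x2 matrices given as lists of rows."""
    AU = [[sum(A[i][k] * U[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(U[k][i] * AU[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[6, 2], [2, 3]]                 # matrix of q = 6 x1^2 + 4 x1 x2 + 3 x2^2
U, (d1, d2) = diagonalize_symmetric_2x2(6, 2, 3)
D = congruence(U, A)
print(d1, d2)
print(D)
```

Up to rounding, `D` is diagonal with the eigenvalues 7 and 2 on its diagonal, so in the new variables the form reads $q = 7y_1^2 + 2y_2^2$. For larger matrices a general eigenvalue routine would be needed instead of the closed form.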


Consider the following example. 
Example 7.58: Choosing New Axes to Simplify a Quadratic Form


Consider the following level curve

\[ 6x_1^2 + 4x_1x_2 + 3x_2^2 = 7 \]

shown in the following graph.

[Figure: the level curve, an ellipse, drawn in the $x_1x_2$-plane]

Use a change of variables to choose new axes such that the ellipse is oriented parallel
to the new coordinate axes. In other words, use a change of variables to rewrite $q$ to
eliminate the $x_1x_2$ term.


Solution. Notice that the level curve is given by $q = 7$ for $q = 6x_1^2 + 4x_1x_2 + 3x_2^2$. This is the
same quadratic form that we examined earlier in Example 7.55. Therefore we know that we
can write $q = \vec{x}^T A \vec{x}$ for the matrix

\[ A = \begin{bmatrix} 6 & 2 \\ 2 & 3 \end{bmatrix} \]

Now we want to orthogonally diagonalize $A$ to write $U^T A U = D$ for an orthogonal matrix
$U$ and diagonal matrix $D$. The details are left to the reader, and you can verify that the
resulting matrices are

\[
U = \begin{bmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{bmatrix},
\qquad
D = \begin{bmatrix} 7 & 0 \\ 0 & 2 \end{bmatrix}
\]

Next we write $\vec{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$. It follows that $\vec{x} = U\vec{y}$.

We can now express the quadratic form $q$ in terms of $\vec{y}$, using the entries from $D$ as
coefficients as follows:

\[
\begin{aligned}
q &= d_{11}y_1^2 + d_{22}y_2^2 \\
  &= 7y_1^2 + 2y_2^2
\end{aligned}
\]

Hence the level curve can be written $7y_1^2 + 2y_2^2 = 7$. The graph of this equation is given
by:

[Figure: the ellipse $7y_1^2 + 2y_2^2 = 7$ drawn in the $y_1y_2$-plane]



The change of variables results in new axes such that with respect to the new axes, the 
ellipse is oriented parallel to the coordinate axes. These are called the principal axes of 
the quadratic form. □ 
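As a numerical sanity check of this example, the Python sketch below takes points on the ellipse $7y_1^2 + 2y_2^2 = 7$, maps them back through $\vec{x} = U\vec{y}$, and evaluates the original form $6x_1^2 + 4x_1x_2 + 3x_2^2$. The matrix `U` used here has unit eigenvectors of $A$ for the eigenvalues 7 and 2 as its columns; the signs of those columns are a choice made for this sketch.

```python
import math

S5 = math.sqrt(5)
# Columns of U: unit eigenvectors of A = [[6, 2], [2, 3]] for eigenvalues 7 and 2.
U = [[2 / S5, -1 / S5],
     [1 / S5,  2 / S5]]

def q(x1, x2):
    """The original quadratic form in the x variables."""
    return 6 * x1 ** 2 + 4 * x1 * x2 + 3 * x2 ** 2

max_error = 0.0
for k in range(12):
    t = 2 * math.pi * k / 12
    # A point on 7 y1^2 + 2 y2^2 = 7, parametrized by the angle t.
    y1, y2 = math.cos(t), math.sqrt(7 / 2) * math.sin(t)
    # Change of variables x = U y.
    x1 = U[0][0] * y1 + U[0][1] * y2
    x2 = U[1][0] * y1 + U[1][1] * y2
    max_error = max(max_error, abs(q(x1, x2) - 7))
print(max_error)
```

The reported error is at the level of floating point rounding, which is consistent with both equations describing the same curve.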

The following is another example of diagonalizing a quadratic form. 


Example 7.59: Choosing New Axes to Simplify a Quadratic Form 


Consider the level curve

\[ 5x_1^2 - 6x_1x_2 + 5x_2^2 = 8 \]

shown in the following graph.

[Figure: the level curve, an ellipse, drawn in the $x_1x_2$-plane]

Use a change of variables to choose new axes such that the ellipse is oriented parallel
to the new coordinate axes. In other words, use a change of variables to rewrite $q$ to
eliminate the $x_1x_2$ term.


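Solution. First, express the level curve as $\vec{x}^T A \vec{x}$ where $\vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $A$ is symmetric. Let

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]

Then $q = \vec{x}^T A \vec{x}$ is given by

\[
\begin{aligned}
q &= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
     \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
     \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
  &= a_{11}x_1^2 + (a_{12} + a_{21})x_1x_2 + a_{22}x_2^2
\end{aligned}
\]

Equating this to the given description for $q$, we have

\[ 5x_1^2 - 6x_1x_2 + 5x_2^2 = a_{11}x_1^2 + (a_{12} + a_{21})x_1x_2 + a_{22}x_2^2 \]

This implies that $a_{11} = 5$, $a_{22} = 5$ and in order for $A$ to be symmetric,
$a_{12} = a_{21} = \frac{1}{2}(a_{12} + a_{21}) = -3$. The result is

\[ A = \begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix} \]

We can write $q = \vec{x}^T A \vec{x}$ as

\[
\begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 8
\]

Next, orthogonally diagonalize the matrix $A$ to write $U^T A U = D$. The details are left to
the reader and the necessary matrices are given by

\[
U = \begin{bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \end{bmatrix},
\qquad
D = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}
\]

Write $\vec{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, such that $\vec{x} = U\vec{y}$. Then it follows that $q$ is given by

\[
\begin{aligned}
q &= d_{11}y_1^2 + d_{22}y_2^2 \\
  &= 2y_1^2 + 8y_2^2
\end{aligned}
\]

Therefore the level curve can be written as $2y_1^2 + 8y_2^2 = 8$.

This is an ellipse which is parallel to the coordinate axes. Its graph is of the form

[Figure: the ellipse $2y_1^2 + 8y_2^2 = 8$ drawn in the $y_1y_2$-plane]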


Thus this change of variables chooses new axes such that with respect to these new axes, 
the ellipse is oriented parallel to the coordinate axes. □ 


7.4.5. Exercises 

1. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$.

\[ A = \begin{bmatrix} 11 & -1 & -4 \\ -1 & 11 & -4 \\ -4 & -4 & 14 \end{bmatrix} \]

Hint: Two eigenvalues are 12 and 18.
2. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$.

\[ A = \begin{bmatrix} 4 & 1 & -2 \\ 1 & 4 & -2 \\ -2 & -2 & 7 \end{bmatrix} \]

Hint: One eigenvalue is 3.

3. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

\[ A = \begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \]

Hint: One eigenvalue is -2.

4. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

\[ A = \begin{bmatrix} 17 & -7 & -4 \\ -7 & 17 & -4 \\ -4 & -4 & 14 \end{bmatrix} \]

Hint: Two eigenvalues are 18 and 24.

5. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

\[ A = \begin{bmatrix} 13 & 1 & 4 \\ 1 & 13 & 4 \\ 4 & 4 & 10 \end{bmatrix} \]

Hint: Two eigenvalues are 12 and 18.

6. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.



5 

3 



A = 

^vW5 

14 

5 





7 

15 

Hint: The eigenvalues are $-3, -2, 1$.

7. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

\[ A = \begin{bmatrix} 3 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix} \]

8. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

\[ A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 1 \\ 0 & 1 & 5 \end{bmatrix} \]

9. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

| IV3V2 ±V2 

IVzVz 1 -|V3 

IV2 -IV3 | 

Hint: The eigenvalues are 0, 2, 2 where 2 is listed twice because it is a root of multiplicity 2.

10. Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by
finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^T A U = D$.

1 \VzVq 

A= |V3v/2 | ±V2V6 

IV3V6 I 

Hint: The eigenvalues are 2, 1, 0. 

11. Find the eigenvalues and an orthonormal basis of eigenvectors for the matrix 

| IV3V2 ~^V3V6 
-^V6 -±V2V6 -§ 

Hint: The eigenvalues are $1, 2, -2$.

12. Find the eigenvalues and an orthonormal basis of eigenvectors for the matrix



1 

2 

~IV6y/5 

15^5 ' 

A = 

-IV6-/5 

7 

5 



TS ^ 


9 

10 


Hint: The eigenvalues are —1, 2, —1 where —1 is listed twice because it has multiplicity 
2 as a zero of the characteristic equation. 

13. Explain why a matrix $A$ is symmetric if and only if there exists an orthogonal matrix
$U$ such that $A = U^T D U$ for $D$ a diagonal matrix.


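14. Show that if $A$ is a real symmetric matrix and $\lambda$ and $\mu$ are two different eigenvalues,
then if $\vec{x}$ is an eigenvector for $\lambda$ and $\vec{y}$ is an eigenvector for $\mu$, then $\vec{x} \cdot \vec{y} = 0$. Also all
eigenvalues are real. Supply reasons for each step in the following argument. First

\[
\lambda \vec{x}^T \overline{\vec{x}} = (A\vec{x})^T \overline{\vec{x}} = \vec{x}^T A \overline{\vec{x}} = \vec{x}^T \overline{A\vec{x}} = \vec{x}^T \overline{\lambda}\, \overline{\vec{x}} = \overline{\lambda}\, \vec{x}^T \overline{\vec{x}}
\]

and so $\lambda = \overline{\lambda}$. This shows that all eigenvalues are real. It follows all the eigenvectors
are real. Why? Now let $\vec{x}$, $\vec{y}$, $\mu$ and $\lambda$ be given as above.

\[
\lambda (\vec{x} \cdot \vec{y}) = \lambda\vec{x} \cdot \vec{y} = A\vec{x} \cdot \vec{y} = \vec{x} \cdot A\vec{y} = \vec{x} \cdot \mu\vec{y} = \mu (\vec{x} \cdot \vec{y})
\]

and so

\[ (\lambda - \mu)\, \vec{x} \cdot \vec{y} = 0 \]

Why does it follow that $\vec{x} \cdot \vec{y} = 0$?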


15. Using the Gram Schmidt process or the QR factorization, find an orthonormal basis 
for the following span: 



16. Using the Gram Schmidt process or the QR factorization, find an orthonormal basis
for the following span:
17. A quadratic form in three variables is an expression of the form $a_1x^2 + a_2y^2 + a_3z^2 +
a_4xy + a_5xz + a_6yz$. Show that every such quadratic form may be written as

\[
\begin{bmatrix} x & y & z \end{bmatrix} A \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

where $A$ is a symmetric matrix.

18. Given a quadratic form in three variables, $x$, $y$, and $z$, show there exists an orthogonal
matrix $U$ and variables $x', y', z'$ such that

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = U \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
\]

with the property that in terms of the new variables, the quadratic form is

\[ \lambda_1 (x')^2 + \lambda_2 (y')^2 + \lambda_3 (z')^2 \]

where the numbers $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of the matrix $A$ in Problem 17.

19. Consider the quadratic form $q$ given by $q = 3x_1^2 - 12x_1x_2 - 2x_2^2$.

(a) Write $q$ in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix $A$.

(b) Use a change of variables to rewrite $q$ to eliminate the $x_1x_2$ term.

20. Consider the quadratic form $q$ given by $q = -2x_1^2 + 2x_1x_2 - 2x_2^2$.

(a) Write $q$ in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix $A$.

(b) Use a change of variables to rewrite $q$ to eliminate the $x_1x_2$ term.

21. Consider the quadratic form $q$ given by $q = 7x_1^2 + 6x_1x_2 - x_2^2$.

(a) Write $q$ in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix $A$.

(b) Use a change of variables to rewrite $q$ to eliminate the $x_1x_2$ term.
8. Some Curvilinear Coordinate Systems


8.1 Polar Coordinates and Polar Graphs 


Outcomes 


A. Understand polar coordinates. 

B. Convert points between Cartesian and polar coordinates. 


You have likely encountered the Cartesian coordinate system in many aspects of mathematics.
There is an alternative way to represent points in space, called polar coordinates.
The idea is suggested in the following picture.

[Figure: a point labeled $(x, y)$ and $(r, \theta)$ in the $xy$-plane]


Consider the point above, which would be specified as $(x, y)$ in Cartesian coordinates.
We can also specify this point using polar coordinates, which we write as $(r, \theta)$. The number
$r$ is the distance from the origin $(0, 0)$ to the point, while $\theta$ is the angle shown between the
positive x axis and the line from the origin to the point. In this way, the point can be
specified in polar coordinates as $(r, \theta)$.

Now suppose we are given an ordered pair $(r, \theta)$ where $r$ and $\theta$ are real numbers. We
want to determine the point specified by this ordered pair. We can use $\theta$ to identify a ray
from the origin as follows. Let the ray pass from $(0, 0)$ through the point $(\cos\theta, \sin\theta)$ as
shown.
The ray is identified on the graph as the line from the origin, through the point $(\cos(\theta), \sin(\theta))$.
Now if $r > 0$, go a distance equal to $r$ in the direction of the displayed arrow starting at
$(0, 0)$. If $r < 0$, move in the opposite direction a distance of $|r|$. This is the point determined
by $(r, \theta)$.

It is common to assume that $\theta$ is in the interval $[0, 2\pi)$ and $r > 0$. In this case, there is
a very simple relationship between the Cartesian and polar coordinates, given by

\[ x = r\cos(\theta), \quad y = r\sin(\theta) \tag{8.1} \]

These equations demonstrate how to find the Cartesian coordinates when we are given
the polar coordinates of a point. They can also be used to find the polar coordinates when
we know $(x, y)$. A simpler way to do this is the following equations:

\[
r = \sqrt{x^2 + y^2}, \quad \tan(\theta) = \frac{y}{x} \tag{8.2}
\]
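Equations 8.1 and 8.2 translate directly into code. The following Python sketch uses the illustrative names `polar_to_cartesian` and `cartesian_to_polar`. One point deserves care: $\tan(\theta) = y/x$ alone does not determine the quadrant, so this sketch uses the standard library function `math.atan2`, which takes the signs of $x$ and $y$ into account, and then shifts the result into $[0, 2\pi)$. That normalization is a convention chosen here, not something forced by the equations.

```python
import math

def polar_to_cartesian(r, theta):
    """Equation 8.1: x = r cos(theta), y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_polar(x, y):
    """Equation 8.2 with the quadrant resolved by atan2.

    Returns r >= 0 and theta in [0, 2*pi), up to floating point rounding.
    At the origin the angle is not determined; this sketch returns 0.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta

print(polar_to_cartesian(5, math.pi / 6))
print(cartesian_to_polar(3, 4))
print(cartesian_to_polar(-math.sqrt(3), 1))
```

The three calls correspond to the point $(5, \pi/6)$ and to the Cartesian points $(3, 4)$ and $(-\sqrt{3}, 1)$ treated in the examples of this section.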


In the next example, we look at how to find the Cartesian coordinates of a point specified 
by polar coordinates. 



Solution. The point is specified by the polar coordinates $(5, \pi/6)$. Therefore $r = 5$ and
$\theta = \pi/6$. From 8.1

\[ x = r\cos(\theta) = 5\cos\left(\frac{\pi}{6}\right) = \frac{5}{2}\sqrt{3} \]

\[ y = r\sin(\theta) = 5\sin\left(\frac{\pi}{6}\right) = \frac{5}{2} \]

Thus the Cartesian coordinates are $\left(\frac{5}{2}\sqrt{3}, \frac{5}{2}\right)$. The point is shown in the graph below.

□


Consider the following example of the case where r < 0. 



Solution. For the point specified by the polar coordinates $(-5, \pi/6)$, $r = -5$, and $\theta = \pi/6$.
From 8.1

\[ x = r\cos(\theta) = -5\cos\left(\frac{\pi}{6}\right) = -\frac{5}{2}\sqrt{3} \]

\[ y = r\sin(\theta) = -5\sin\left(\frac{\pi}{6}\right) = -\frac{5}{2} \]

Thus the Cartesian coordinates are $\left(-\frac{5}{2}\sqrt{3}, -\frac{5}{2}\right)$. The point is shown in the following graph.

Recall from the previous example that for the point specified by $(5, \pi/6)$, the Cartesian
coordinates are $\left(\frac{5}{2}\sqrt{3}, \frac{5}{2}\right)$. Notice that in this example, by multiplying $r$ by $-1$, the resulting
Cartesian coordinates are also multiplied by $-1$. □

The following picture exhibits both points in the above two examples to emphasize how 
they are just on opposite sides of (0, 0) but at the same distance from (0, 0). 
In the next two examples, we look at how to convert Cartesian coordinates to polar
coordinates.


Example 8.3: Finding Polar Coordinates 


Suppose the Cartesian coordinates of a point are (3, 4) . Find a pair of polar coordinates 
which correspond to this point. 


Solution. Using equation 8.2, we can find $r$ and $\theta$. Hence $r = \sqrt{3^2 + 4^2} = 5$. It remains to
identify the angle $\theta$ between the positive x axis and the line from the origin to the point.
Since both the x and y values are positive, the point is in the first quadrant. Therefore, $\theta$ is
between 0 and $\pi/2$. Using this and 8.2, we have to solve:

\[ \tan(\theta) = \frac{4}{3} \]

Alternatively, we can use equation 8.1 as follows:

\[ 3 = 5\cos(\theta) \]

\[ 4 = 5\sin(\theta) \]

Solving these equations, we find that, approximately, $\theta = 0.927295$ radians. □

Consider the following example. 


Example 8.4: Finding Polar Coordinates 


Suppose the Cartesian coordinates of a point are $(-\sqrt{3}, 1)$. Find the polar coordinates
which correspond to this point.


Solution. Given the point $(-\sqrt{3}, 1)$,

\[
\begin{aligned}
r &= \sqrt{1^2 + (-\sqrt{3})^2} \\
  &= \sqrt{1 + 3} \\
  &= 2
\end{aligned}
\]

In this case, the point is in the second quadrant since the x value is negative and the y value
is positive. Therefore, $\theta$ will be between $\pi/2$ and $\pi$. Solving the equations

\[ -\sqrt{3} = 2\cos(\theta) \]

\[ 1 = 2\sin(\theta) \]

we find that $\theta = 5\pi/6$. Hence the polar coordinates for this point are $(2, 5\pi/6)$. □


Consider this example. Suppose we used $r = -2$ and $\theta = 2\pi - (\pi/6) = 11\pi/6$. These
coordinates specify the same point as above. Observe that there are infinitely many ways to
identify this particular point with polar coordinates. In fact, every point can be represented
with polar coordinates in infinitely many ways. Because of this, it will usually be the case
that $\theta$ is confined to lie in some interval of length $2\pi$ and $r > 0$, for real numbers $r$ and $\theta$.
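Since many pairs describe the same point, it can be useful to reduce any given pair to a standard one. The Python sketch below does this under the convention $r \geq 0$ and $\theta \in [0, 2\pi)$; the name `normalize_polar` is illustrative. A negative $r$ is handled by flipping its sign and adding $\pi$ to the angle, which is the "move in the opposite direction" rule described earlier.

```python
import math

def normalize_polar(r, theta):
    """Return an equivalent pair with r >= 0 and theta in [0, 2*pi)."""
    if r < 0:
        # Moving backwards along the ray for theta is the same as
        # moving forwards along the ray for theta + pi.
        r, theta = -r, theta + math.pi
    return r, theta % (2 * math.pi)

print(normalize_polar(-2, 11 * math.pi / 6))
print(normalize_polar(2, 5 * math.pi / 6 + 2 * math.pi))
```

Both calls reduce to the pair $(2, 5\pi/6)$ up to rounding, in agreement with the discussion above.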

Just as with Cartesian coordinates, it is possible to use relations between the polar 
coordinates to specify points in the plane. The process of sketching the graphs of these 
relations is very similar to that used to sketch graphs of functions in Cartesian coordinates. 
Consider a relation between polar coordinates of the form $r = f(\theta)$. To graph such a
relation, first make a table of the form

    $\theta$      $r$
    $\theta_1$    $f(\theta_1)$
    $\theta_2$    $f(\theta_2)$
    $\vdots$      $\vdots$



Graph the resulting points and connect them with a curve. The following picture illustrates 
how to begin this process. 



To find the point in the plane corresponding to the ordered pair $(f(\theta), \theta)$, we follow the
same process as when finding the point corresponding to $(r, \theta)$.

Consider the following example of this procedure, incorporating computer software. 



Solution. We will use the computer software Maple to complete this example. The command
which produces the polar graph of the above equation is:

    > plot(1+cos(t), t=0..2*Pi, coords=polar)

Here we use $t$ to represent the variable $\theta$ for convenience. The command tells Maple that $r$
is given by $1 + \cos(t)$ and that $t \in [0, 2\pi]$.



The above graph makes sense when considered in terms of trigonometric functions. Suppose
$\theta = 0$, $r = 2$ and let $\theta$ increase to $\pi/2$. As $\theta$ increases, $\cos\theta$ decreases to 0. Thus the
line from the origin to the point on the curve should get shorter as $\theta$ goes from 0 to $\pi/2$.
As $\theta$ goes from $\pi/2$ to $\pi$, $\cos\theta$ decreases, eventually equaling $-1$ at $\theta = \pi$. Thus $r = 0$ at
this point. This scenario is depicted in the above graph, which shows a curve called a
cardioid.

The following picture illustrates the above procedure for obtaining the polar graph of
$r = 1 + \cos(\theta)$. In this picture, the concentric circles correspond to values of $r$ while the
rays from the origin correspond to the angles which are shown on the picture. The dot on
the ray corresponding to the angle $\pi/6$ is located at a distance of $r = 1 + \cos(\pi/6)$ from
the origin. The dot on the ray corresponding to the angle $\pi/3$ is located at a distance of
$r = 1 + \cos(\pi/3)$ from the origin and so forth. The polar graph is obtained by connecting
such points with a smooth curve, with the result being the figure shown above.

[Figure: polar grid of concentric circles and rays, with points of $r = 1 + \cos(\theta)$ marked on the rays]

□
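The table of values behind such a graph can also be produced with a few lines of code. The following Python sketch is an alternative to the Maple command, not a replacement for it: it only tabulates points and does not draw them. The helper name `polar_table` and the choice of sampling every $\pi/6$ are assumptions made for illustration.

```python
import math

def polar_table(f, thetas):
    """Return rows (theta, r, x, y) for the polar relation r = f(theta)."""
    rows = []
    for theta in thetas:
        r = f(theta)
        rows.append((theta, r, r * math.cos(theta), r * math.sin(theta)))
    return rows

thetas = [k * math.pi / 6 for k in range(13)]        # 0, pi/6, ..., 2*pi
cardioid = polar_table(lambda t: 1 + math.cos(t), thetas)
for theta, r, x, y in cardioid:
    print(f"theta = {theta:6.3f}   r = {r:6.3f}   (x, y) = ({x:6.3f}, {y:6.3f})")
```

The values of $r$ start at 2 for $\theta = 0$, shrink to 0 at $\theta = \pi$, and grow back to 2, which matches the description of the cardioid above. Using more sample angles gives a smoother picture when the points are plotted.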
Consider another example of constructing a polar graph.



Solution. The graph of the polar equation $r = 1 + 2\cos\theta$ for $\theta \in [0, 2\pi]$ is given as follows.



To see the way this is graphed, consider the following picture. First the indicated points 
were graphed and then the curve was drawn to connect the points. When done by a computer, 
many more points are used to create a more accurate picture. 

Consider first the following table of points. 


    $\theta$ | $\pi/6$      | $\pi/3$ | $\pi/2$ | $5\pi/6$     | $\pi$ | $7\pi/6$     | $4\pi/3$ | $5\pi/3$
    $r$      | $\sqrt{3}+1$ | $2$     | $1$     | $1-\sqrt{3}$ | $-1$  | $1-\sqrt{3}$ | $0$      | $2$


Note how some entries in the table have r < 0. To graph these points, simply move in 
the opposite direction. These types of points are responsible for the small loop on the inside 
of the larger loop in the graph. 
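To see where the negative values of $r$ occur, one can sample the relation and record the angles at which $r < 0$. The Python sketch below does this for $r = 1 + 2\cos\theta$ with a one degree step, which is an arbitrary resolution chosen for illustration. Algebraically, $r < 0$ exactly when $\cos\theta < -\frac{1}{2}$, that is, for $2\pi/3 < \theta < 4\pi/3$.

```python
import math

def r_limacon(theta):
    """The polar relation r = 1 + 2 cos(theta)."""
    return 1 + 2 * math.cos(theta)

samples = [k * math.pi / 180 for k in range(361)]    # every degree on [0, 2*pi]
negative = [t for t in samples if r_limacon(t) < 0]
inner_start, inner_end = min(negative), max(negative)
print(inner_start, inner_end)

# A point with negative r is plotted on the opposite side of the origin.
r_pi = r_limacon(math.pi)                            # r = -1 at theta = pi
x_pi, y_pi = r_pi * math.cos(math.pi), r_pi * math.sin(math.pi)
print(x_pi, y_pi)
```

The sampled angles with $r < 0$ fall between roughly $2\pi/3$ and $4\pi/3$, and the point for $\theta = \pi$ lands near $(1, 0)$ on the positive x axis rather than on the negative side. These are the points that trace the inner loop.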
[Figure: polar grid with the tabulated points of $r = 1 + 2\cos\theta$ marked and connected]

□

The process of constructing these graphs can be greatly facilitated by computer software. 
However, the use of such software should not replace understanding the steps involved. 

The next example shows the graph for the equation r = 3 + sin — ) . For complicated 

V 6 / 

polar graphs, computer software is used to facilitate the process. 



Solution.

[Figure: graph of the polar equation]

□

The next example shows another situation in which $r$ can be negative.

Example 8.8: A Polar Graph: Negative r


Graph $r = 3\sin(4\theta)$ for $\theta \in [0, 2\pi]$.


Solution. 



We conclude this section with an interesting graph of a simple polar equation. 



Solution. The graph of this polar equation is a spiral. This is the case because as $\theta$ increases,
so does $r$.



In the next section, we will look at two ways of generalizing polar coordinates to three 
dimensions. 
8.1.1. Exercises


1. In the following, polar coordinates $(r, \theta)$ for a point in the plane are given. Find the
corresponding Cartesian coordinates.


(a) $(2, \pi/4)$

(b) $(-2, \pi/4)$

(c) $(3, \pi/3)$

(d) $(-3, \pi/3)$

(e) $(2, 5\pi/6)$

(f) $(-2, 11\pi/6)$

(g) $(2, \pi/2)$

(h) $(1, 3\pi/2)$

(i) $(-3, 3\pi/4)$

(j) $(3, 5\pi/4)$

(k) $(-2, \pi/6)$


2. Consider the following Cartesian coordinates $(x, y)$. Find polar coordinates corresponding
to these points.

(a) $(-1, 1)$

(b) $(\sqrt{3}, -1)$

(c) $(0, 2)$

(d) $(-5, 0)$

(e) $(-2\sqrt{3}, 2)$

(f) $(2, -2)$

(g) $(-1, \sqrt{3})$

(h) $(-1, -\sqrt{3})$

3. The following relations are written in terms of Cartesian coordinates $(x, y)$. Rewrite
them in terms of polar coordinates, $(r, \theta)$.

(a) $y = x^2$

(b) $y = 2x + 6$

(c) $x^2 + y^2 = 4$

(d) $x^2 - y^2 = 1$

4. Use a calculator or computer algebra system to graph the following polar relations. 
(a) $r = 1 - \sin(2\theta)$, $\theta \in [0, 2\pi]$

(b) $r = \sin(4\theta)$, $\theta \in [0, 2\pi]$

(c) $r = \cos(3\theta) + \sin(2\theta)$, $\theta \in [0, 2\pi]$

(d) $r = \theta$, $\theta \in [0, 15]$

5. Graph the polar equation $r = 1 + \sin\theta$ for $\theta \in [0, 2\pi]$.

6. Graph the polar equation $r = 2 + \sin\theta$ for $\theta \in [0, 2\pi]$.

7. Graph the polar equation $r = 1 + 2\sin\theta$ for $\theta \in [0, 2\pi]$.

8. Graph the polar equation $r = 2 + \sin(2\theta)$ for $\theta \in [0, 2\pi]$.

9. Graph the polar equation $r = 1 + \sin(2\theta)$ for $\theta \in [0, 2\pi]$.

10. Graph the polar equation $r = 1 + \sin(3\theta)$ for $\theta \in [0, 2\pi]$.

11. Describe how to solve for $r$ and $\theta$ in terms of $x$ and $y$ in polar coordinates.

12. This problem deals with parabolas, ellipses, and hyperbolas and their equations. Let
$l > 0$, $e \geq 0$ and consider

\[ r = \frac{l}{1 \pm e\cos\theta} \]

Show that if $e = 0$, the graph of this equation gives a circle. Show that if $0 < e < 1$,
the graph is an ellipse, if $e = 1$ it is a parabola and if $e > 1$, it is a hyperbola.

8.2 Spherical and Cylindrical Coordinates 


Outcomes 


A. Understand cylindrical and spherical coordinates. 

B. Convert points between Cartesian, cylindrical, and spherical coordinates. 


Spherical and cylindrical coordinates are two generalizations of polar coordinates to three
dimensions. We will first look at cylindrical coordinates.

When moving from polar coordinates in two dimensions to cylindrical coordinates in
three dimensions, we use the polar coordinates in the xy plane and add a z coordinate. For
this reason, we use the notation $(r, \theta, z)$ to express cylindrical coordinates. The relationship
between Cartesian coordinates $(x, y, z)$ and cylindrical coordinates $(r, \theta, z)$ is given by

\[
\begin{aligned}
x &= r\cos(\theta) \\
y &= r\sin(\theta) \\
z &= z
\end{aligned}
\]
where $r \geq 0$, $\theta \in [0, 2\pi)$, and $z$ is simply the Cartesian coordinate. Notice that $x$ and $y$ are
defined as the usual polar coordinates in the xy-plane. Recall that $r$ is defined as the length
of the ray from the origin to the point $(x, y, 0)$, while $\theta$ is the angle between the positive
x-axis and this same ray.

To illustrate this coordinate system, consider the following two pictures. In the first of
these, both $r$ and $z$ are known. The cylinder corresponds to a given value for $r$. A useful way
to think of $r$ is as the distance between a point in three dimensions and the z-axis. Every
point on the cylinder shown is at the same distance from the z-axis. Giving a value for $z$
results in a horizontal circle, or cross section of the cylinder at the given height on the z axis
(shown below as a black line on the cylinder). In the second picture, the point is specified
completely by also knowing $\theta$ as shown.



[Figures: left, $r$ and $z$ are known; right, $r$, $\theta$ and $z$ are known]


Every point of three dimensional space other than the z axis has unique cylindrical
coordinates. Of course there are infinitely many cylindrical coordinates for the origin and
for the z-axis. Any $\theta$ will work if $r = 0$ and $z$ is given.
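The conversion between Cartesian and cylindrical coordinates can be sketched in Python as follows. The function names are illustrative, and the use of `math.atan2` to pick the correct quadrant for $\theta$ is a choice made in this sketch. For points on the z-axis, where any $\theta$ would do, this sketch simply returns $\theta = 0$.

```python
import math

def cylindrical_to_cartesian(r, theta, z):
    """x = r cos(theta), y = r sin(theta), z unchanged."""
    return r * math.cos(theta), r * math.sin(theta), z

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and theta in [0, 2*pi).

    On the z-axis (x = y = 0) the angle is not unique; 0 is returned.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

point = cylindrical_to_cartesian(2, 3 * math.pi / 4, -2)
print(point)
print(cartesian_to_cylindrical(*point))
print(cartesian_to_cylindrical(0, 0, 5))
```

Converting to Cartesian coordinates and back returns the original triple up to rounding, except on the z-axis where the angle is fixed by the convention above.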

Consider now spherical coordinates, the second generalization of polar form in three
dimensions. For a point $(x, y, z)$ in three dimensional space, the spherical coordinates are
defined as follows.

$\rho$ : the length of the ray from the origin to the point

$\theta$ : the angle between the positive x-axis and the ray from the origin to the point $(x, y, 0)$

$\phi$ : the angle between the positive z-axis and the ray from the origin to the point of interest

The spherical coordinates are determined by $(\rho, \phi, \theta)$. The relation between these and the
Cartesian coordinates $(x, y, z)$ for a point is as follows.

\[
\begin{aligned}
x &= \rho\sin(\phi)\cos(\theta), & \phi &\in [0, \pi] \\
y &= \rho\sin(\phi)\sin(\theta), & \theta &\in [0, 2\pi) \\
z &= \rho\cos\phi, & \rho &\geq 0
\end{aligned}
\]

Consider the pictures below. The first illustrates the surface when $\rho$ is known, which is a
sphere of radius $\rho$. The second picture corresponds to knowing both $\rho$ and $\phi$, which results
in a circle about the z-axis. Suppose the first picture demonstrates a graph of the Earth.
Then the circle in the second picture would correspond to a particular latitude.
[Figures: left, $\rho$ is known; right, $\rho$ and $\phi$ are known]


Giving the third coordinate, $\theta$, completely specifies the point of interest. This is demonstrated
in the following picture. If the latitude corresponds to $\phi$, then we can think of $\theta$ as
the longitude.

[Figure: $\rho$, $\phi$ and $\theta$ are known]


The following picture summarizes the geometric meaning of the three coordinate systems.

[Figure: a point shown with its Cartesian, cylindrical, and spherical coordinates]
Therefore, we can represent the same point in three ways, using Cartesian coordinates,
$(x, y, z)$, cylindrical coordinates, $(r, \theta, z)$, and spherical coordinates $(\rho, \phi, \theta)$.

Using this picture to review, call the point of interest $P$ for convenience. The Cartesian
coordinates for $P$ are $(x, y, z)$. Then $\rho$ is the distance between the origin and the point $P$.
The angle between the positive z axis and the line between the origin and $P$ is denoted by
$\phi$. Then $\theta$ is the angle between the positive x axis and the line joining the origin to the
point $(x, y, 0)$ as shown. This gives the spherical coordinates, $(\rho, \phi, \theta)$. Given the line from
the origin to $(x, y, 0)$, $r = \rho\sin(\phi)$ is the length of this line. Thus $r$ and $\theta$ determine a point
in the xy-plane. In other words, $r$ and $\theta$ are the usual polar coordinates and $r \geq 0$ and
$\theta \in [0, 2\pi)$. Letting $z$ denote the usual z coordinate of a point in three dimensions, $(r, \theta, z)$
are the cylindrical coordinates of $P$.

The relation between spherical and cylindrical coordinates is that $r = \rho\sin(\phi)$ and the $\theta$
is the same as the $\theta$ of cylindrical and polar coordinates.
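The relations among the three coordinate systems can be collected in a short Python sketch. The function names are illustrative. The angle $\phi$ is computed here with `math.atan2(r, z)`, where $r = \sqrt{x^2 + y^2}$, which returns a value in $[0, \pi]$ because $r \geq 0$; this is one convenient way to obtain $\phi$, and at the origin, where the angles are not determined, the sketch returns 0 for both.

```python
import math

def spherical_to_cartesian(rho, phi, theta):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def cartesian_to_spherical(x, y, z):
    """Return (rho, phi, theta) with phi in [0, pi] and theta in [0, 2*pi)."""
    rho = math.sqrt(x * x + y * y + z * z)
    r = math.hypot(x, y)                       # distance to the z-axis
    phi = math.atan2(r, z)                     # angle from the positive z-axis
    theta = math.atan2(y, x) % (2 * math.pi)   # same theta as in polar coordinates
    return rho, phi, theta

def spherical_to_cylindrical(rho, phi, theta):
    """Uses r = rho sin(phi) and z = rho cos(phi); theta is shared."""
    return rho * math.sin(phi), theta, rho * math.cos(phi)

print(cartesian_to_spherical(1, 1, math.sqrt(2)))
print(spherical_to_cartesian(2, math.pi / 4, math.pi / 4))
print(spherical_to_cylindrical(2, math.pi / 4, math.pi / 4))
```

For the point $(1, 1, \sqrt{2})$ the sketch gives $\rho = 2$ and $\phi = \theta = \pi/4$ up to rounding, and the cylindrical radius is $r = \rho\sin(\phi) = \sqrt{2}$.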

We will now consider some examples. 


Example 8.10: Describing a Surface in Spherical Coordinates 


Express the surface $z = \frac{1}{\sqrt{3}}\sqrt{x^2 + y^2}$ in spherical coordinates.


Solution. We will use the equations from above:

\[
\begin{aligned}
x &= \rho\sin(\phi)\cos(\theta), & \phi &\in [0, \pi] \\
y &= \rho\sin(\phi)\sin(\theta), & \theta &\in [0, 2\pi) \\
z &= \rho\cos\phi, & \rho &\geq 0
\end{aligned}
\]

To express the surface in spherical coordinates, we substitute these expressions into the
equation. This is done as follows:

\[
\rho\cos(\phi) = \frac{1}{\sqrt{3}}\sqrt{(\rho\sin(\phi)\cos(\theta))^2 + (\rho\sin(\phi)\sin(\theta))^2} = \frac{1}{3}\sqrt{3}\,\rho\sin(\phi)
\]

This reduces to

\[ \tan(\phi) = \sqrt{3} \]

and so $\phi = \pi/3$.

□
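The conclusion of this example can be spot checked numerically. The Python sketch below picks a few points on the surface $z = \frac{1}{\sqrt{3}}\sqrt{x^2 + y^2}$ and computes the angle from the positive z-axis for each. The sample points and the helper name `polar_angle` are arbitrary choices for illustration.

```python
import math

def polar_angle(x, y, z):
    """Angle phi between the positive z-axis and the ray to (x, y, z)."""
    return math.atan2(math.hypot(x, y), z)

angles = []
for x, y in [(1, 0), (0, 2), (-3, 4), (0.5, -0.25)]:
    z = math.hypot(x, y) / math.sqrt(3)      # the point lies on the surface
    angles.append(polar_angle(x, y, z))
print(angles)
print(math.pi / 3)
```

Every computed angle agrees with $\pi/3$ up to rounding, which is what the equation $\phi = \pi/3$ says: the surface is a cone about the z-axis.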


Example 8.11: Describing a Surface in Spherical Coordinates 


Express the surface y = x in terms of spherical coordinates. 


Solution. Using the same procedure as the previous example, this says $\rho\sin(\phi)\sin(\theta) =
\rho\sin(\phi)\cos(\theta)$. Simplifying, $\sin(\theta) = \cos(\theta)$, which you could also write $\tan(\theta) = 1$. □

We conclude this section with an example of how to describe a surface using cylindrical 
coordinates. 
Example 8.12: Describing a Surface in Cylindrical Coordinates


Express the surface $x^2 + y^2 = 4$ in cylindrical coordinates.


Solution. Recall that to convert from Cartesian to cylindrical coordinates, we can use the
following equations:

\[ x = r\cos(\theta), \quad y = r\sin(\theta), \quad z = z \]

Substituting these equations in for $x, y, z$ in the equation for the surface, we have

\[ r^2\cos^2(\theta) + r^2\sin^2(\theta) = 4 \]

This can be written as $r^2(\cos^2(\theta) + \sin^2(\theta)) = 4$. Recall that $\cos^2(\theta) + \sin^2(\theta) = 1$. Thus
$r^2 = 4$ or $r = 2$. □

8.2.1. Exercises 


1. The following are the cylindrical coordinates of points, $(r, \theta, z)$. Find the Cartesian
and spherical coordinates of each point.

(a) (5,f, -3) 

(b) (3.1,4) 

M df.i) 

(d) (2,f,-2) 

(<0 (3,f,-l) 

(f) (8, ill, -11) 

2. The following are the Cartesian coordinates of points, $(x, y, z)$. Find the cylindrical
and spherical coordinates of these points.


(a) (§72,172,-3) 

(b) (f, |V3, 2) 

(c) (-§ 72 ,§ 72 , 11 ) 

(d) (-§,§73,23) 

(e) $(-\sqrt{3}, -1, -5)$

(f) (§,-§73,-7) 

(g) $(\sqrt{2}, \sqrt{6}, 2\sqrt{2})$

(h) (-|V3,!,1) 

(i) (-§72, §72, -§73) 
(j) $(-\sqrt{3}, 1, 2\sqrt{3})$

(k) (-1V2,JV6,4V2) 

3. The following are spherical coordinates of points in the form $(\rho, \phi, \theta)$. Find the Cartesian
and cylindrical coordinates of each point.

M (M-t) 

(b) (2,f.f) 

(d) (4. f> i 1 ) 

W (4. f.f) 

(f) (4,f,¥) 

4. Describe the surface $\phi = \pi/4$ in Cartesian coordinates, where $\phi$ is the polar angle in
spherical coordinates.

5. Describe the surface $\theta = \pi/4$ in spherical coordinates, where $\theta$ is the angle measured
from the positive x axis.

6. Describe the surface $r = 5$ in Cartesian coordinates, where $r$ is one of the cylindrical
coordinates.

7. Describe the surface $\rho = 4$ in Cartesian coordinates, where $\rho$ is the distance to the
origin.

8. Give the cone described by $z = \sqrt{x^2 + y^2}$ in cylindrical coordinates and in spherical
coordinates.

9. The following are described in Cartesian coordinates. Rewrite them in terms of spherical
coordinates.

(a) $z = x^2 + y^2$.

(b) $x^2 - y^2 = 1$.

(c) $z^2 + x^2 + y^2 = 6$.

(d) $z = \sqrt{x^2 + y^2}$.

(e) $y = x$.

(f) $z = x$.

10. The following are described in Cartesian coordinates. Rewrite them in terms of cylindrical
coordinates.


(a) $z = x^2 + y^2$.

(b) $x^2 - y^2 = 1$.

(c) $z^2 + x^2 + y^2 = 6$.

(d) $z = \sqrt{x^2 + y^2}$.

(e) $y = x$.

(f) $z = x$.
9. Vector Spaces


9.1 Algebraic Considerations 


Outcomes 


A. Develop the abstract concept of a vector space through axioms. 

B. Deduce basic properties of vector spaces. 

C. Use the vector space axioms to determine if a set and its operations constitute 
a vector space. 


In this section we consider the idea of an abstract vector space. A vector space is a set 
equipped with two operations which satisfy the vector space axioms listed below. The two 
operations are vector addition, denoted by +, and scalar multiplication, denoted by placing 
the scalar next to the vector. A vector space need not have the usual operations, and for 
this reason the operations will always be given in the definition of the vector space. The 
axioms below must hold however addition and scalar multiplication are defined for the 
vector space. 

It is important to note that we have seen much of this content before, in terms of 
$\mathbb{R}^n$. We will prove in this section that $\mathbb{R}^n$ is an example of a vector space and 
therefore all discussions in this chapter will pertain to $\mathbb{R}^n$. While it may be useful to 
consider all concepts of this chapter in terms of $\mathbb{R}^n$, it is also important to understand 
that these concepts apply to all vector spaces. 

In the following definition, we will choose scalars a, b to be real numbers and are thus 
dealing with real vector spaces. However, we could also choose scalars which are complex 
numbers. In this case, we would call the vector space V complex. 

Definition 9.1: Vector Space Axioms 


A vector space V is a set of vectors with two operations defined, addition and scalar 
multiplication. Let v, w, z be vectors in V. Then they satisfy the following axioms of 
addition: 

• Closed under Addition 

If v, w are in V, then v + w is also in V. 

• The Commutative Law of Addition 

v + w = w + v 

• The Associative Law of Addition 

(v + w) + z = v + (w + z) 

• The Existence of an Additive Identity 

v + 0 = v 

• The Existence of an Additive Inverse 

v + (-v) = 0 

Let $a, b \in \mathbb{R}$. The following axioms apply to the operation of scalar multiplication. 

• Closed under Scalar Multiplication 

If a is a real number, and v is in V, then av is in V. 

• a(v + w) = av + aw 

• (a + b)v = av + bv 

• a(bv) = (ab)v 

• 1v = v 


Consider the following example, in which we prove that $\mathbb{R}^n$ is in fact a vector space. 

Example 9.2: $\mathbb{R}^n$ 


$\mathbb{R}^n$, under the usual operations of vector addition and scalar multiplication, is a vector 
space. 


Solution. To show that $\mathbb{R}^n$ is a vector space, we need to show that the above axioms hold. 
Let x, y, z be vectors in $\mathbb{R}^n$. We first prove the axioms for vector addition. 

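• To show that $\mathbb{R}^n$ is closed under addition, we must show that for two vectors in 
$\mathbb{R}^n$ their sum is also in $\mathbb{R}^n$. The sum x + y is given by: 

\[
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
+ \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}
\]

The sum is a vector with n entries, showing that it is in $\mathbb{R}^n$. Hence $\mathbb{R}^n$ is closed 
under vector addition. 

• To show that addition is commutative, consider the following: 

\[
\begin{aligned}
x + y &= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
       + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
      = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix} \\
      &= \begin{bmatrix} y_1 + x_1 \\ y_2 + x_2 \\ \vdots \\ y_n + x_n \end{bmatrix}
      = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
       + \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \\
      &= y + x
\end{aligned}
\]

Hence addition of vectors in $\mathbb{R}^n$ is commutative. 

• We will show that addition of vectors in $\mathbb{R}^n$ is associative in a similar way. 

\[
\begin{aligned}
(x + y) + z
 &= \left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
         + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \right)
         + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}
  = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}
         + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \\
 &= \begin{bmatrix} (x_1 + y_1) + z_1 \\ (x_2 + y_2) + z_2 \\ \vdots \\ (x_n + y_n) + z_n \end{bmatrix}
  = \begin{bmatrix} x_1 + (y_1 + z_1) \\ x_2 + (y_2 + z_2) \\ \vdots \\ x_n + (y_n + z_n) \end{bmatrix} \\
 &= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
         + \begin{bmatrix} y_1 + z_1 \\ y_2 + z_2 \\ \vdots \\ y_n + z_n \end{bmatrix}
  = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
         + \left( \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
         + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \right) \\
 &= x + (y + z)
\end{aligned}
\]

Hence addition of vectors is associative. 

• Next, we show the existence of an additive identity. Let 
$\vec{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$. Then 

\[
x + \vec{0}
 = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
 = \begin{bmatrix} x_1 + 0 \\ x_2 + 0 \\ \vdots \\ x_n + 0 \end{bmatrix}
 = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 = x
\]

Hence the zero vector 0 is an additive identity. 

• Next, we prove the existence of an additive inverse. Let 
$-x = \begin{bmatrix} -x_1 \\ -x_2 \\ \vdots \\ -x_n \end{bmatrix}$. Then 

\[
x + (-x)
 = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 + \begin{bmatrix} -x_1 \\ -x_2 \\ \vdots \\ -x_n \end{bmatrix}
 = \begin{bmatrix} x_1 - x_1 \\ x_2 - x_2 \\ \vdots \\ x_n - x_n \end{bmatrix}
 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
 = \vec{0}
\]

Hence -x is an additive inverse. 

We now need to prove the axioms related to scalar multiplication. Let a, b be real numbers 
and let x, y be vectors in $\mathbb{R}^n$. 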
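
• We first show that $\mathbb{R}^n$ is closed under scalar multiplication. To do so, we show that ax 
is also a vector with n entries. 

\[
ax = a \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
   = \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
\]

The vector ax is again a vector with n entries, showing that $\mathbb{R}^n$ is closed under scalar 
multiplication. 

• We wish to show that a(x + y) = ax + ay. 

\[
\begin{aligned}
a(x + y)
 &= a \left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
           + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \right)
  = a \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}
  = \begin{bmatrix} a(x_1 + y_1) \\ a(x_2 + y_2) \\ \vdots \\ a(x_n + y_n) \end{bmatrix} \\
 &= \begin{bmatrix} a x_1 + a y_1 \\ a x_2 + a y_2 \\ \vdots \\ a x_n + a y_n \end{bmatrix}
  = \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
  + \begin{bmatrix} a y_1 \\ a y_2 \\ \vdots \\ a y_n \end{bmatrix} \\
 &= ax + ay
\end{aligned}
\]

• Next, we wish to show that (a + b)x = ax + bx. 

\[
\begin{aligned}
(a + b)x
 &= (a + b) \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
  = \begin{bmatrix} (a + b) x_1 \\ (a + b) x_2 \\ \vdots \\ (a + b) x_n \end{bmatrix}
  = \begin{bmatrix} a x_1 + b x_1 \\ a x_2 + b x_2 \\ \vdots \\ a x_n + b x_n \end{bmatrix} \\
 &= \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
  + \begin{bmatrix} b x_1 \\ b x_2 \\ \vdots \\ b x_n \end{bmatrix}
  = ax + bx
\end{aligned}
\]

• We wish to show that a(bx) = (ab)x. 

\[
a(bx)
 = a \begin{bmatrix} b x_1 \\ b x_2 \\ \vdots \\ b x_n \end{bmatrix}
 = \begin{bmatrix} a(b x_1) \\ a(b x_2) \\ \vdots \\ a(b x_n) \end{bmatrix}
 = \begin{bmatrix} (ab) x_1 \\ (ab) x_2 \\ \vdots \\ (ab) x_n \end{bmatrix}
 = (ab)x
\]

• Finally, 1x = x, since multiplying each entry $x_i$ by 1 leaves it unchanged. 

By the above proofs, it is clear that $\mathbb{R}^n$ satisfies the vector space axioms. Hence, 
$\mathbb{R}^n$ is a vector space under the usual operations of vector addition and scalar 
multiplication. □ 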
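
The entrywise arguments above can also be spot-checked by machine. The following is a minimal sketch, not a proof: it tests the axioms of Definition 9.1 on a few sample vectors, using exact fractions so that rounding plays no role. The names `add`, `scale`, and `axioms_hold`, as well as the sample data, are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction
from itertools import product


def add(x, y):
    """Componentwise sum of two vectors with the same number of entries."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(a, x):
    """Multiply every entry of the vector x by the scalar a."""
    return tuple(a * xi for xi in x)


def axioms_hold(vectors, scalars):
    """Check the axioms of Definition 9.1 on finite samples (hypothetical helper)."""
    n = len(vectors[0])
    zero = (Fraction(0),) * n
    for x, y, z in product(vectors, repeat=3):
        if len(add(x, y)) != n:                        # closed under addition
            return False
        if add(x, y) != add(y, x):                     # commutative law
            return False
        if add(add(x, y), z) != add(x, add(y, z)):     # associative law
            return False
    for x in vectors:
        if add(x, zero) != x:                          # additive identity
            return False
        if add(x, scale(-1, x)) != zero:               # additive inverse
            return False
        if scale(1, x) != x:                           # 1x = x
            return False
    for a, b in product(scalars, repeat=2):
        for x, y in product(vectors, repeat=2):
            if len(scale(a, x)) != n:                  # closed under scalar multiplication
                return False
            if scale(a, add(x, y)) != add(scale(a, x), scale(a, y)):
                return False
            if scale(a + b, x) != add(scale(a, x), scale(b, x)):
                return False
            if scale(a, scale(b, x)) != scale(a * b, x):
                return False
    return True


# Sample data chosen only for illustration.
sample_vectors = [
    (Fraction(1), Fraction(-2), Fraction(3)),
    (Fraction(1, 2), Fraction(0), Fraction(-5, 3)),
    (Fraction(4), Fraction(7, 2), Fraction(-1)),
]
sample_scalars = [Fraction(0), Fraction(-3), Fraction(2, 5)]

print(axioms_hold(sample_vectors, sample_scalars))
```

A finite check like this can only expose a failing axiom; it is the entrywise proof above that establishes the axioms for every vector.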

We now consider some examples of vector spaces. 


Example 9.3: Vector Space of Polynomials 


Let $\mathbb{P}_2$ be the set of all polynomials of at most degree 2 as well as the zero polynomial. 
Define addition to be the standard addition of polynomials, and scalar multiplication 
the usual multiplication of a polynomial by a number. Then $\mathbb{P}_2$ is a vector space. 


Solution. We can write $\mathbb{P}_2$ explicitly as 

\[ \mathbb{P}_2 = \{ a_2 x^2 + a_1 x + a_0 : a_i \in \mathbb{R} \text{ for all } i \} \]

To show that $\mathbb{P}_2$ is a vector space, we verify the axioms. Let p(x), q(x), r(x) be polynomials 
in $\mathbb{P}_2$ and let a, b, c be real numbers. Write $p(x) = p_0 + p_1 x + p_2 x^2$, 
$q(x) = q_0 + q_1 x + q_2 x^2$, and $r(x) = r_0 + r_1 x + r_2 x^2$. 


• We first prove that addition of polynomials in $\mathbb{P}_2$ is closed. For two polynomials in 
$\mathbb{P}_2$ we need to show that their sum is also a polynomial in $\mathbb{P}_2$. From the definition of 
$\mathbb{P}_2$, a polynomial is contained in $\mathbb{P}_2$ if it is of degree at most 2 or the zero polynomial. 

\[
\begin{aligned}
p(x) + q(x) &= p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0 \\
            &= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0)
\end{aligned}
\]

The sum is a polynomial of degree at most 2 (or the zero polynomial) and therefore is in 
$\mathbb{P}_2$. It follows that $\mathbb{P}_2$ is closed under addition. 

• We need to show that addition is commutative, that is p(x) + q(x) = q(x) + p(x). 

\[
\begin{aligned}
p(x) + q(x) &= p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0 \\
            &= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0) \\
            &= (q_2 + p_2) x^2 + (q_1 + p_1) x + (q_0 + p_0) \\
            &= q_2 x^2 + q_1 x + q_0 + p_2 x^2 + p_1 x + p_0 \\
            &= q(x) + p(x)
\end{aligned}
\]

• Next, we need to show that addition is associative. That is, that 
(p(x) + q(x)) + r(x) = p(x) + (q(x) + r(x)). 

\[
\begin{aligned}
(p(x) + q(x)) + r(x)
 &= (p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0) + r_2 x^2 + r_1 x + r_0 \\
 &= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0) + r_2 x^2 + r_1 x + r_0 \\
 &= (p_2 + q_2 + r_2) x^2 + (p_1 + q_1 + r_1) x + (p_0 + q_0 + r_0) \\
 &= p_2 x^2 + p_1 x + p_0 + (q_2 + r_2) x^2 + (q_1 + r_1) x + (q_0 + r_0) \\
 &= p_2 x^2 + p_1 x + p_0 + (q_2 x^2 + q_1 x + q_0 + r_2 x^2 + r_1 x + r_0) \\
 &= p(x) + (q(x) + r(x))
\end{aligned}
\]

• Next, we must prove that there exists an additive identity. Let $0(x) = 0x^2 + 0x + 0$. 

\[
\begin{aligned}
p(x) + 0(x) &= p_2 x^2 + p_1 x + p_0 + 0x^2 + 0x + 0 \\
            &= (p_2 + 0) x^2 + (p_1 + 0) x + (p_0 + 0) \\
            &= p_2 x^2 + p_1 x + p_0 \\
            &= p(x)
\end{aligned}
\]

Hence an additive identity exists, specifically the zero polynomial. 

• Next we must prove that there exists an additive inverse. Let 
$-p(x) = -p_2 x^2 - p_1 x - p_0$ and consider the following: 

\[
\begin{aligned}
p(x) + (-p(x)) &= p_2 x^2 + p_1 x + p_0 + (-p_2 x^2 - p_1 x - p_0) \\
               &= (p_2 - p_2) x^2 + (p_1 - p_1) x + (p_0 - p_0) \\
               &= 0x^2 + 0x + 0 \\
               &= 0(x)
\end{aligned}
\]

Hence an additive inverse -p(x) exists such that p(x) + (-p(x)) = 0(x). 

We now need to verify the axioms related to scalar multiplication. 

• First we prove that $\mathbb{P}_2$ is closed under scalar multiplication. That is, we show that 
ap(x) is also a polynomial of degree at most 2 (or the zero polynomial). 

\[ a p(x) = a (p_2 x^2 + p_1 x + p_0) = a p_2 x^2 + a p_1 x + a p_0 \]

Therefore $\mathbb{P}_2$ is closed under scalar multiplication. 

• We need to show that a(p(x) + q(x)) = ap(x) + aq(x). 

\[
\begin{aligned}
a(p(x) + q(x)) &= a (p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0) \\
 &= a ((p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0)) \\
 &= a(p_2 + q_2) x^2 + a(p_1 + q_1) x + a(p_0 + q_0) \\
 &= (a p_2 + a q_2) x^2 + (a p_1 + a q_1) x + (a p_0 + a q_0) \\
 &= a p_2 x^2 + a p_1 x + a p_0 + a q_2 x^2 + a q_1 x + a q_0 \\
 &= a p(x) + a q(x)
\end{aligned}
\]

• Next we show that (a + b)p(x) = ap(x) + bp(x). 

\[
\begin{aligned}
(a + b) p(x) &= (a + b)(p_2 x^2 + p_1 x + p_0) \\
 &= (a + b) p_2 x^2 + (a + b) p_1 x + (a + b) p_0 \\
 &= a p_2 x^2 + a p_1 x + a p_0 + b p_2 x^2 + b p_1 x + b p_0 \\
 &= a p(x) + b p(x)
\end{aligned}
\]

• The next axiom which needs to be verified is a(bp(x)) = (ab)p(x). 

\[
\begin{aligned}
a(b p(x)) &= a (b (p_2 x^2 + p_1 x + p_0)) \\
 &= a (b p_2 x^2 + b p_1 x + b p_0) \\
 &= ab p_2 x^2 + ab p_1 x + ab p_0 \\
 &= (ab)(p_2 x^2 + p_1 x + p_0) \\
 &= (ab) p(x)
\end{aligned}
\]

• Finally, we show that 1p(x) = p(x). 

\[
\begin{aligned}
1 p(x) &= 1 (p_2 x^2 + p_1 x + p_0) \\
 &= 1 p_2 x^2 + 1 p_1 x + 1 p_0 \\
 &= p_2 x^2 + p_1 x + p_0 \\
 &= p(x)
\end{aligned}
\]


Since the above axioms hold, we know that $\mathbb{P}_2$ as described above is a vector space. □ 

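
One point in the closure argument is worth a concrete illustration: the sum of two polynomials of degree 2 can have degree less than 2, which is why the set is defined using "degree at most 2". The sketch below stores a polynomial as the coefficient triple `(p0, p1, p2)`. This representation and the helper names `poly_add`, `poly_scale`, `degree`, and `evaluate` are assumptions made for illustration only.

```python
from fractions import Fraction


def poly_add(p, q):
    """Add polynomials stored as (p0, p1, p2), meaning p0 + p1*x + p2*x^2."""
    return tuple(pi + qi for pi, qi in zip(p, q))


def poly_scale(a, p):
    """Multiply every coefficient of p by the scalar a."""
    return tuple(a * pi for pi in p)


def degree(p):
    """Degree of p, or None for the zero polynomial."""
    nonzero = [i for i, coeff in enumerate(p) if coeff != 0]
    return max(nonzero) if nonzero else None


def evaluate(p, x):
    """Value of the polynomial p at the number x."""
    return sum(coeff * x ** i for i, coeff in enumerate(p))


p = (Fraction(1), Fraction(0), Fraction(3))     # 3x^2 + 1
q = (Fraction(2), Fraction(5), Fraction(-3))    # -3x^2 + 5x + 2
total = poly_add(p, q)                          # the x^2 terms cancel, leaving 5x + 3

print(degree(p), degree(q), degree(total))
```

Here the leading terms cancel, so the sum has degree 1, yet it is still a coefficient triple and hence still an element of the set.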
Another important example of a vector space is the set of all matrices of the same size. 


Example 9.4: Vector Space of Matrices 


Let $\mathbb{M}_{2,3}$ be the set of all 2 × 3 matrices. Using the usual operations of matrix addition 
and scalar multiplication, show that $\mathbb{M}_{2,3}$ is a vector space. 


Solution. Let A, B be 2 × 3 matrices in $\mathbb{M}_{2,3}$. We first prove the axioms for addition. 

• In order to prove that $\mathbb{M}_{2,3}$ is closed under matrix addition, we show that the sum 
A + B is in $\mathbb{M}_{2,3}$. This means showing that A + B is a 2 × 3 matrix. 

\[
A + B
 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
 + \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
 = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\
                   a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{bmatrix}
\]

You can see that the sum is a 2 × 3 matrix, so it is in $\mathbb{M}_{2,3}$. It follows that 
$\mathbb{M}_{2,3}$ is closed under matrix addition. 

• The remaining axioms regarding matrix addition follow from properties of matrix addition, 
found in Proposition 2.7. Therefore $\mathbb{M}_{2,3}$ satisfies the axioms of matrix addition. 

We now turn our attention to the axioms regarding scalar multiplication. Let A, B be 
matrices in $\mathbb{M}_{2,3}$ and let c be a real number. 

• We first show that $\mathbb{M}_{2,3}$ is closed under scalar multiplication. That is, we show that 
cA is a 2 × 3 matrix. 

\[
cA = c \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
   = \begin{bmatrix} c a_{11} & c a_{12} & c a_{13} \\ c a_{21} & c a_{22} & c a_{23} \end{bmatrix}
\]

This is a 2 × 3 matrix in $\mathbb{M}_{2,3}$, which proves that the set is closed under scalar 
multiplication. 

• The remaining axioms of scalar multiplication follow from properties of scalar multiplication 
of matrices, from Proposition 2.10. Therefore $\mathbb{M}_{2,3}$ satisfies the axioms of scalar 
multiplication. 

In conclusion, $\mathbb{M}_{2,3}$ satisfies the required axioms and is a vector space. □ 

While here we proved that the set of all 2 × 3 matrices is a vector space, there is nothing 
special about this choice of matrix size. In fact if we instead consider $\mathbb{M}_{m,n}$, the set of all 
m × n matrices, then $\mathbb{M}_{m,n}$ is a vector space under the operations of matrix addition and 
scalar multiplication. 

We now examine an example of a set that does not satisfy all of the above axioms, and 
is therefore not a vector space. 


Example 9.5: Not a Vector Space 


Let V denote the set of 2 × 3 matrices. Let addition in V be defined by A + B = A for 
matrices A, B in V. Let scalar multiplication in V be the usual scalar multiplication 
of matrices. Show that V is not a vector space. 



Solution. In order to show that V is not a vector space, it suffices to find only one axiom 
which is not satisfied. We will begin by examining the axioms for addition until one is found 
which does not hold. Let A, B be matrices in V. 


• We first want to check if addition is closed. Consider A + B. By the definition of 
addition in the example, we have that A + B = A. Since A is a 2 x 3 matrix, it follows 
that the sum A + B is in V, and V is closed under addition. 

• We now wish to check if addition is commutative. That is, we want to check if A + B = 
B + A for all choices of A and B in V. From the definition of addition, we have that 
A + B = A and B + A = B. Therefore, we can find A, B in V such that these sums 
are not equal. One example is 


\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\]

Using the operation defined by A + B = A, we have 

\[
A + B = A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
B + A = B = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\]

It follows that $A + B \neq B + A$. Therefore addition as defined for V is not commutative 
and V fails this axiom. Hence V is not a vector space. 


□ 
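
The counterexample can be replayed directly. This is a small sketch in which the function name `odd_add` is a hypothetical label for the nonstandard addition A + B = A of Example 9.5, and matrices are stored as tuples of rows.

```python
def odd_add(A, B):
    """The nonstandard 'addition' of Example 9.5: the sum is always the left operand."""
    return A


# The two matrices used in the solution, stored row by row.
A = ((1, 0, 0), (0, 0, 0))
B = ((0, 0, 0), (1, 0, 0))

# Commutativity would require odd_add(A, B) == odd_add(B, A).
is_commutative_here = odd_add(A, B) == odd_add(B, A)
print(is_commutative_here)
```

A single failing pair is enough, since an axiom must hold for all choices of vectors.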


Consider another example of a vector space. 


Example 9.6: Vector Space of Functions 


Let S be a nonempty set and define $\mathbb{F}_S$ to be the set of real functions defined on S. In 
other words, each element of $\mathbb{F}_S$ is a function $f : S \to \mathbb{R}$. Letting a, b, c be scalars 
and f, g, h functions, the vector operations are defined as 

\[ (f + g)(x) = f(x) + g(x) \]

\[ (af)(x) = a(f(x)) \]

Show that $\mathbb{F}_S$ is a vector space. 


Solution. To verify that $\mathbb{F}_S$ is a vector space, we must prove the axioms beginning with 
those for addition. Let f, g, h be functions in $\mathbb{F}_S$. 

• First we check that addition is closed. For functions f, g defined on the set S, their 
sum given by 

\[ (f + g)(x) = f(x) + g(x) \]

is again a function defined on S. Hence this sum is in $\mathbb{F}_S$ and $\mathbb{F}_S$ is closed under 
addition. 

• Secondly, we check the commutative law of addition: 

\[ (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x) \]

Since x is arbitrary, f + g = g + f. 

• Next we check the associative law of addition: 

\[
\begin{aligned}
((f + g) + h)(x) &= (f + g)(x) + h(x) = (f(x) + g(x)) + h(x) \\
 &= f(x) + (g(x) + h(x)) = f(x) + (g + h)(x) = (f + (g + h))(x)
\end{aligned}
\]

and so (f + g) + h = f + (g + h). 

• Next we check for an additive identity. Let 0 denote the function which is given by 
0(x) = 0. Then this is an additive identity because 

\[ (f + 0)(x) = f(x) + 0(x) = f(x) \]

and so f + 0 = f. 

• Finally, check for an additive inverse. Let -f be the function which satisfies 
(-f)(x) = -f(x). Then 

\[ (f + (-f))(x) = f(x) + (-f)(x) = f(x) + (-f(x)) = 0 \]

Hence f + (-f) = 0. 

Now, check the axioms for scalar multiplication. 

• We first need to check that $\mathbb{F}_S$ is closed under scalar multiplication. For a function f 
in $\mathbb{F}_S$ and real number a, the function (af)(x) = a(f(x)) is again a function defined 
on the set S. Hence af is in $\mathbb{F}_S$ and $\mathbb{F}_S$ is closed under scalar multiplication. 

• Next, 

\[ ((a + b)f)(x) = (a + b)f(x) = af(x) + bf(x) = (af + bf)(x) \]

and so (a + b)f = af + bf. 

• Similarly, 

\[ (a(f + g))(x) = a(f + g)(x) = a(f(x) + g(x)) = af(x) + ag(x) = (af + ag)(x) \]

and so a(f + g) = af + ag. 

• Also, 

\[ ((ab)f)(x) = (ab)f(x) = a(bf(x)) = (a(bf))(x) \]

so (ab)f = a(bf). 

• Finally (1f)(x) = 1f(x) = f(x) so 1f = f. 

It follows that $\mathbb{F}_S$ satisfies all the required axioms and is a vector space. □ 
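
The pointwise definitions of f + g and af translate naturally into functions that return functions. The sketch below assumes a small finite set S so that two functions can be compared by checking every point. The helper names `f_add`, `f_scale`, `zero`, and `same_function`, and the sample functions, are illustrative assumptions.

```python
def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def f_scale(a, f):
    """Pointwise scalar multiple: (af)(x) = a * f(x)."""
    return lambda x: a * f(x)


def zero(x):
    """The zero function, the additive identity of the function space."""
    return 0


def same_function(f, g, domain):
    """Two functions on S are equal when they agree at every point of S."""
    return all(f(x) == g(x) for x in domain)


S = [0, 1, 2, 3]                # a small nonempty set, chosen for illustration
f = lambda x: x * x
g = lambda x: 2 * x - 1
h = lambda x: 5 - x
a, b = 3, -2

checks = {
    "commutative": same_function(f_add(f, g), f_add(g, f), S),
    "associative": same_function(f_add(f_add(f, g), h), f_add(f, f_add(g, h)), S),
    "identity": same_function(f_add(f, zero), f, S),
    # -f is built here as (-1)f, which has the same values as the -f of the text.
    "inverse": same_function(f_add(f, f_scale(-1, f)), zero, S),
    "a(f + g) = af + ag": same_function(
        f_scale(a, f_add(f, g)), f_add(f_scale(a, f), f_scale(a, g)), S),
    "(a + b)f = af + bf": same_function(
        f_scale(a + b, f), f_add(f_scale(a, f), f_scale(b, f)), S),
    "(ab)f = a(bf)": same_function(f_scale(a * b, f), f_scale(a, f_scale(b, f)), S),
    "1f = f": same_function(f_scale(1, f), f, S),
}
print(checks)
```

Each check reduces to an identity between real numbers at a single point x, which mirrors how every step of the solution is argued.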

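We conclude this section with the following important theorem. 


Theorem 9.7 

In any vector space, the following are true: 

1. The additive identity, 0, is unique. 

2. The additive inverse, -x, is unique. 

3. 0x = 0 for all vectors x. 

4. (-1)x = -x for all vectors x. 


Proof. 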

1. When we say that the additive identity, 0, is unique, we mean that if a vector acts 
like the additive identity, then it is the additive identity. To prove this uniqueness, we 
want to show that another vector which acts like the additive identity is actually equal 
to 0. 

Suppose 0' is also an additive identity. Then, 

\[ \vec{0} + \vec{0}' = \vec{0} \]

Now, for 0 the additive identity given above in the axioms, we have that 

\[ \vec{0}' + \vec{0} = \vec{0}' \]

So by commutativity: 

\[ \vec{0} = \vec{0} + \vec{0}' = \vec{0}' + \vec{0} = \vec{0}' \]

This says that if a vector acts like an additive identity (such as 0'), it in fact equals 0. 
This proves the uniqueness of 0. 

2. When we say that the additive inverse, -x, is unique, we mean that if a vector acts 
like the additive inverse, then it is the additive inverse. Suppose that y acts like an 
additive inverse: 

\[ x + y = \vec{0} \]

Then the following holds: 

\[ y = \vec{0} + y = (-x + x) + y = -x + (x + y) = -x + \vec{0} = -x \]

Thus if y acts like the additive inverse, it is equal to the additive inverse -x. This 
proves the uniqueness of -x. 

3. This statement claims that for all vectors x, scalar multiplication by 0 equals the zero 
vector 0. Consider the following, using the fact that we can write 0 = 0 + 0: 

\[ 0x = (0 + 0)x = 0x + 0x \]

We use a small trick here: add -0x to both sides. This gives 

\[
\begin{aligned}
0x + (-0x) &= 0x + 0x + (-0x) \\
\vec{0} &= 0x + \vec{0} \\
\vec{0} &= 0x
\end{aligned}
\]

This proves that scalar multiplication of any vector by 0 results in the zero vector 0. 

4. Finally, we wish to show that scalar multiplication of -1 and any vector x results in 
the additive inverse of that vector, -x. Recall from 2. above that the additive inverse 
is unique. Consider the following: 

\[
\begin{aligned}
(-1)x + x &= (-1)x + 1x \\
          &= (-1 + 1)x \\
          &= 0x \\
          &= \vec{0}
\end{aligned}
\]

By the uniqueness of the additive inverse shown earlier, any vector which acts like the 
additive inverse must be equal to the additive inverse. It follows that (-1)x = -x. □ 


9.1.1. Exercises 

1. Let X consist of the real valued functions which are defined on an interval [a, b]. For 
$f, g \in X$, f + g is the name of the function which satisfies (f + g)(x) = f(x) + g(x). 
For s a real number, (sf)(x) = s(f(x)). Show this is a vector space. 

2. Consider functions defined on $\{1, 2, \cdots, n\}$ having values in $\mathbb{R}$. Explain how, if V is 
the set of all such functions, V can be considered as $\mathbb{R}^n$. 

3. Let the vectors be polynomials of degree no more than 3. Show that with the usual 
definitions of scalar multiplication and addition wherein, for p(x) a polynomial, 
(ap)(x) = ap(x) and for p, q polynomials (p + q)(x) = p(x) + q(x), this is a vector space. 

9.2 Subspaces 


Outcomes 


A. Determine if a vector is within a given span. 

B. Utilize the subspace test to determine if a set is a subspace of a given vector 
space. 


In this section we will examine the concepts of spanning and subspaces introduced earlier 
in terms of $\mathbb{R}^n$. Here, we will discuss these concepts in terms of abstract vector spaces. 
Consider the following definition. 



In particular, we often speak of subsets of a vector space, such as $X \subseteq V$. By this we 
mean that every element in the set X is contained in the vector space V. 

Definition 9.9: Span of Vectors 

Let $\{v_1, \cdots, v_n\} \subseteq V$. Then 

\[ \operatorname{span}\{v_1, \cdots, v_n\} = \left\{ \sum_{i=1}^{n} c_i v_i : c_i \in \mathbb{R} \right\} \]


When we say that a vector w is in $\operatorname{span}\{v_1, \cdots, v_n\}$ we mean that w can be written as 
a linear combination of the $v_i$. 

Consider the following example. 



Solution. 

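First consider A. We want to see if scalars s, t can be found such that $A = sM_1 + tM_2$. 

\[
\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
 = s \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
 + t \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

The solution to this equation is given by 

\[ 1 = s, \qquad 2 = t \]

and it follows that A is in $\operatorname{span}\{M_1, M_2\}$. 

Now consider B. Again we write $B = sM_1 + tM_2$ and see if a solution can be found for 
s, t. 

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
 = s \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
 + t \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

Clearly no values of s and t can be found such that this equation holds. Therefore B is not 
in $\operatorname{span}\{M_1, M_2\}$. □ 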
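
Membership in a span is a question about a linear system, so it can be automated. The following is a minimal sketch that assumes each 2 × 2 matrix is flattened row by row into a tuple of four numbers and then solves for the coefficients by row reduction over exact fractions. The function names `rref` and `span_coefficients` are hypothetical and chosen for this illustration.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row-echelon form of `rows` and the list of pivot columns."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def span_coefficients(target, generators):
    """Coefficients c with sum(c[i] * generators[i]) == target, or None if none exist."""
    k = len(generators)
    # One row per entry; one column per generator; the last column holds the target.
    augmented = [[g[i] for g in generators] + [target[i]] for i in range(len(target))]
    reduced, pivots = rref(augmented)
    if k in pivots:                 # a pivot in the last column means no solution
        return None
    coeffs = [Fraction(0)] * k      # free variables, if any, are set to 0
    for row, c in zip(reduced, pivots):
        coeffs[c] = row[k]
    return coeffs


# Matrices flattened row by row: [[p, q], [r, s]] becomes (p, q, r, s).
M1 = (1, 0, 0, 0)
M2 = (0, 0, 0, 1)
A = (1, 0, 0, 2)
B = (0, 1, 1, 0)

print(span_coefficients(A, [M1, M2]))
print(span_coefficients(B, [M1, M2]))
```

For A the function returns the coefficients s = 1 and t = 2 found above, and for B it returns `None`, in agreement with the solution.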


Consider another example. 

Solution. To show that p(x) is in the given span, we need to show that it can be written as 
a linear combination of polynomials in the span. Suppose scalars a, b existed such that 

\[ 7x^2 + 4x - 3 = a(4x^2 + x) + b(x^2 - 2x + 3) \]

If this linear combination were to hold, the following would be true: 

\[
\begin{aligned}
4a + b &= 7 \\
a - 2b &= 4 \\
3b &= -3
\end{aligned}
\]

You can verify that a = 2, b = -1 satisfies this system of equations. This means that we 
can write p(x) as follows: 

\[ 7x^2 + 4x - 3 = 2(4x^2 + x) - (x^2 - 2x + 3) \]

Hence p(x) is in the given span. □ 

Consider the following example. 



Solution. Let $p(x) = ax^2 + bx + c$ be an arbitrary polynomial in $\mathbb{P}_2$. To show that S is 
a spanning set, it suffices to show that p(x) can be written as a linear combination of the 
elements of S. In other words, can we find r, s, t such that: 

\[ p(x) = ax^2 + bx + c = r(x^2 + 1) + s(x - 2) + t(2x^2 - x) \]

If a solution r, s, t can be found, then this shows that any such polynomial p(x) 
can be written as a linear combination of the above polynomials and S is a spanning set. 

\[
\begin{aligned}
ax^2 + bx + c &= r(x^2 + 1) + s(x - 2) + t(2x^2 - x) \\
 &= rx^2 + r + sx - 2s + 2tx^2 - tx \\
 &= (r + 2t)x^2 + (s - t)x + (r - 2s)
\end{aligned}
\]

For this to be true, the following must hold: 

\[
\begin{aligned}
a &= r + 2t \\
b &= s - t \\
c &= r - 2s
\end{aligned}
\]

To check that a solution exists, set up the augmented matrix and row reduce: 

\[
\left[ \begin{array}{rrr|c}
1 & 0 & 2 & a \\
0 & 1 & -1 & b \\
1 & -2 & 0 & c
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|c}
1 & 0 & 0 & \frac{1}{2}a + b + \frac{1}{2}c \\
0 & 1 & 0 & \frac{1}{4}a + \frac{1}{2}b - \frac{1}{4}c \\
0 & 0 & 1 & \frac{1}{4}a - \frac{1}{2}b - \frac{1}{4}c
\end{array} \right]
\]

Clearly a solution exists for any choice of a, b, c. Hence S is a spanning set for $\mathbb{P}_2$. 


□ 
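
The solution of the system r + 2t = a, s - t = b, r - 2s = c can be checked by substituting it back in. The sketch below takes r = a/2 + b + c/2, s = a/4 + b/2 - c/4, t = a/4 - b/2 - c/4, which is what solving that system by hand gives, and recombines the three polynomials. The function names `spanning_coefficients` and `combine` are hypothetical.

```python
from fractions import Fraction


def spanning_coefficients(a, b, c):
    """Solve r + 2t = a, s - t = b, r - 2s = c for (r, s, t)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    r = a / 2 + b + c / 2
    s = a / 4 + b / 2 - c / 4
    t = a / 4 - b / 2 - c / 4
    return r, s, t


def combine(r, s, t):
    """Coefficients (x^2, x, constant) of r(x^2 + 1) + s(x - 2) + t(2x^2 - x)."""
    return (r + 2 * t, s - t, r - 2 * s)


# Sample target polynomials ax^2 + bx + c, chosen only for illustration.
targets = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (7, 4, -3), (-2, 5, 9)]
for a, b, c in targets:
    print((a, b, c), combine(*spanning_coefficients(a, b, c)))
```

Each printed pair should agree, since recombining with the solved coefficients returns the original triple (a, b, c).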


Consider now the definition of a subspace. 



The span of a set of vectors as described in Definition 9.9 is an example of a subspace. 
The following fundamental result says that subspaces are subsets of a vector space which are 
themselves vector spaces. 


Theorem 9.14: Subspaces are Vector Spaces 


Let W be a nonempty collection of vectors in V, a vector space. Then W is a subspace 
if and only if W is itself a vector space having the same operations as those defined 
on V. 


Proof. Suppose first that W is a subspace. It is obvious that all the algebraic laws hold on 
W because it is a subset of V and they hold on V. Thus u + v = v + u along with the other 
axioms. Does W contain 0? Yes, because for any u in W it contains 0u = 0. See Theorem 9.7. 

Are the operations of V defined on W? That is, when you add vectors of W do you get 
a vector in W? When you multiply a vector in W by a scalar, do you get a vector in W? 
Yes. This is contained in the definition. Does every vector in W have an additive inverse? 
Yes by Theorem 9.7 because -v = (-1)v which is given to be in W provided $v \in W$. 

Next suppose W is a vector space. Then by definition, it is closed with respect to linear 
combinations. Hence it is a subspace. □ 

When determining spanning sets the following theorem proves useful. 



In other words, this theorem claims that any subspace that contains a set of vectors must 
also contain the span of these vectors. 

The following example will show that two spans, described differently, can in fact be 
equal. 

Solution. We will use Theorem 9.15 to show that $U \subseteq W$ and $W \subseteq U$. It will then follow 
that U = W. 

1. $U \subseteq W$ 

Notice that 2p(x) - q(x) and p(x) + 3q(x) are both in $W = \operatorname{span}\{p(x), q(x)\}$. Then 
by Theorem 9.15 W must contain the span of these polynomials and so $U \subseteq W$. 

2. $W \subseteq U$ 

Notice that 

\[
\begin{aligned}
p(x) &= \frac{3}{7}\left(2p(x) - q(x)\right) + \frac{1}{7}\left(p(x) + 3q(x)\right) \\
q(x) &= -\frac{1}{7}\left(2p(x) - q(x)\right) + \frac{2}{7}\left(p(x) + 3q(x)\right)
\end{aligned}
\]

Hence p(x), q(x) are in $\operatorname{span}\{2p(x) - q(x), p(x) + 3q(x)\}$. By Theorem 9.15 U must 
contain the span of these polynomials and so $W \subseteq U$. 

□ 

To prove that a set is a vector space, one must verify each of the axioms given in Definition 
9.1. This is a cumbersome task, and therefore a shorter procedure is used to verify a subspace. 


Procedure 9.17: Subspace Test 


Suppose W is a subset of a vector space V. Then W is a subspace of V if the following 
three conditions hold, using the operations of V: 

1. The additive identity 0 of V is contained in W. 

2. For any vectors $w_1, w_2$ in W, $w_1 + w_2$ is also in W. 

3. For any vector $w_1$ in W and scalar a, the product $a w_1$ is also in W. 

Therefore it suffices to prove these three steps to show that a set is a subspace. 

Consider the following example. 


Example 9.18: Improper Subspaces 


Let V be an arbitrary vector space. Then V is a subspace of itself. Similarly, the set 
$\{\vec{0}\}$ containing only the zero vector is also a subspace. 

Solution. Using the subspace test in Procedure 9.17 we can show that V and $\{\vec{0}\}$ are 
subspaces of V. 

The conditions are clearly satisfied by V, since V satisfies the vector space axioms. 
Therefore V is a subspace. 

Let's consider the set $\{\vec{0}\}$. 

1. The vector 0 is clearly contained in $\{\vec{0}\}$, so the first condition is satisfied. 

2. Let $w_1, w_2$ be in $\{\vec{0}\}$. Then $w_1 = \vec{0}$ and $w_2 = \vec{0}$ and so 

\[ w_1 + w_2 = \vec{0} + \vec{0} = \vec{0} \]

It follows that the sum is contained in $\{\vec{0}\}$ and the second condition is satisfied. 

3. Let $w_1$ be in $\{\vec{0}\}$ and let a be an arbitrary scalar. Then 

\[ a w_1 = a\vec{0} = \vec{0} \]

Hence the product is contained in $\{\vec{0}\}$ and the third condition is satisfied. 

It follows that $\{\vec{0}\}$ is a subspace of V. □ 

The two subspaces described above are called improper subspaces. Any subspace of 
a vector space V which is not equal to V or $\{\vec{0}\}$ is called a proper subspace. 

Consider another example. 


Example 9.19: Subspace of Polynomials 


Let $\mathbb{P}_2$ be the vector space of polynomials of degree two or less. Let $W \subseteq \mathbb{P}_2$ be all 
polynomials of degree two or less which have 1 as a root. Show that W is a subspace 
of $\mathbb{P}_2$. 


Solution. First, express W as follows: 

\[ W = \{ p(x) = ax^2 + bx + c, \; a, b, c \in \mathbb{R} \mid p(1) = 0 \} \]

We need to show that W satisfies the three conditions of Procedure 9.17. 

1. The zero polynomial of $\mathbb{P}_2$ is given by $0(x) = 0x^2 + 0x + 0 = 0$. Clearly 0(1) = 0 so 
0(x) is contained in W. 

2. Let p(x), q(x) be polynomials in W. It follows that p(1) = 0 and q(1) = 0. Now 
consider p(x) + q(x). Let r(x) represent this sum. 

\[
\begin{aligned}
r(1) &= p(1) + q(1) \\
     &= 0 + 0 \\
     &= 0
\end{aligned}
\]

Therefore the sum is also in W and the second condition is satisfied. 

3. Let p(x) be a polynomial in W and let a be a scalar. It follows that p(1) = 0. Consider 
the product ap(x). 

\[
\begin{aligned}
ap(1) &= a(0) \\
      &= 0
\end{aligned}
\]

Therefore the product is in W and the third condition is satisfied. 

It follows that W is a subspace of $\mathbb{P}_2$. 


□ 
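
The three conditions of the subspace test can be tried out on sample elements of W. In this sketch a polynomial ax^2 + bx + c is stored as the triple `(a, b, c)`, so the condition p(1) = 0 becomes a + b + c = 0. The helper names `in_W`, `poly_add`, and `poly_scale`, together with the samples, are assumptions made for illustration, and a finite check of this kind does not replace the proof.

```python
from fractions import Fraction
from itertools import product


def in_W(p):
    """p = (a, b, c) stands for a*x^2 + b*x + c; it lies in W when p(1) = a + b + c = 0."""
    a, b, c = p
    return a + b + c == 0


def poly_add(p, q):
    """Coefficientwise sum of two polynomials."""
    return tuple(pi + qi for pi, qi in zip(p, q))


def poly_scale(k, p):
    """Multiply every coefficient of p by the scalar k."""
    return tuple(k * pi for pi in p)


zero = (0, 0, 0)
# Sample polynomials with 1 as a root: x^2 - x, x^2 - 1, 2x^2 - 3x + 1, 4x - 4.
samples = [(1, -1, 0), (1, 0, -1), (2, -3, 1), (0, 4, -4)]
scalars = [Fraction(-7, 2), 0, 3]

contains_zero = in_W(zero)
closed_under_addition = all(in_W(poly_add(p, q)) for p, q in product(samples, repeat=2))
closed_under_scaling = all(in_W(poly_scale(k, p)) for k in scalars for p in samples)

print(contains_zero, closed_under_addition, closed_under_scaling)
```

By contrast, a polynomial such as x^2, stored as `(1, 0, 0)`, fails `in_W` because its value at 1 is 1.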


9.2.1. Exercises 


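1. Let V be a vector space and suppose $\{x_1, \cdots, x_k\}$ is a set of vectors in V. Show that 
0 is in $\operatorname{span}\{x_1, \cdots, x_k\}$. 

2. Determine if $p(x) = 4x^2 - x$ is in the span given by 

\[ \operatorname{span}\{x^2 + x, \; x^2 - 1, \; -x + 2\} \]

3. Determine if $p(x) = -x^2 + x + 2$ is in the span given by 

\[ \operatorname{span}\{x^2 + x + 1, \; 2x^2 + x\} \]

4. Determine if $A = \begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix}$ is in the span given by 

\[
\operatorname{span}\left\{
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\right\}
\]

5. Show that the spanning set in Question 4 is a spanning set for $\mathbb{M}_{22}$, the vector space 
of all 2 × 2 matrices. 

6. Let $M = \{ u = (u_1, u_2, u_3, u_4) \in \mathbb{R}^4 : |u_1| < 4 \}$. Is M a subspace of $\mathbb{R}^4$? 

7. Let $M = \{ u = (u_1, u_2, u_3, u_4) \in \mathbb{R}^4 : \sin(u_1) = 1 \}$. Is M a subspace of $\mathbb{R}^4$? 

8. Let W be a subset of $\mathbb{M}_{22}$ given by 

\[ W = \{ A \mid A \in \mathbb{M}_{22}, \; A^T = A \} \]

In words, W is the set of all symmetric 2 × 2 matrices. Is W a subspace of $\mathbb{M}_{22}$? 

9. Let W be a subset of $\mathbb{M}_{22}$ given by 

\[
W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;\middle|\; a, b, c, d \in \mathbb{R}, \; a + b = c + d \right\}
\]

Is W a subspace of $\mathbb{M}_{22}$? 

10. Let W be a subset of $\mathbb{P}_3$ given by 

\[ W = \{ ax^3 + bx^2 + cx + d \mid a, b, c, d \in \mathbb{R}, \; d = 0 \} \]

Is W a subspace of $\mathbb{P}_3$? 

11. Let W be a subset of $\mathbb{P}_3$ given by 

\[ W = \{ p(x) = ax^3 + bx^2 + cx + d \mid a, b, c, d \in \mathbb{R}, \; p(2) = 1 \} \]

Is W a subspace of $\mathbb{P}_3$? 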

9.3 Linear Independence and Bases 


Outcomes 


A. Determine if a set is linearly independent. 

B. Extend a linearly independent set and shrink a spanning set to a basis of a given 
vector space. 

In this section, we will again explore concepts introduced earlier in terms of $\mathbb{R}^n$ and 
extend them to apply to abstract vector spaces. 



The set of vectors is called linearly dependent if it is not linearly independent. 

Solution. To determine if this set S is linearly independent, we write 

\[ a(x^2 + 2x - 1) + b(2x^2 - x + 3) = 0x^2 + 0x + 0 \]

If it is linearly independent, then a = b = 0 will be the only solution. We proceed as follows. 

\[
\begin{aligned}
a(x^2 + 2x - 1) + b(2x^2 - x + 3) &= 0x^2 + 0x + 0 \\
ax^2 + 2ax - a + 2bx^2 - bx + 3b &= 0x^2 + 0x + 0 \\
(a + 2b)x^2 + (2a - b)x + (-a + 3b) &= 0x^2 + 0x + 0
\end{aligned}
\]

It follows that 

\[
\begin{aligned}
a + 2b &= 0 \\
2a - b &= 0 \\
-a + 3b &= 0
\end{aligned}
\]

The augmented matrix and resulting reduced row-echelon form are given by 

\[
\left[ \begin{array}{rr|r}
1 & 2 & 0 \\
2 & -1 & 0 \\
-1 & 3 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rr|r}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 0
\end{array} \right]
\]

Hence the solution is a = b = 0 and the set is linearly independent. □ 
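
The same question can be put in terms of rank: a finite set of coefficient vectors is linearly independent exactly when the matrix having them as rows has rank equal to the number of vectors. The sketch below uses that equivalent criterion rather than the augmented system of the solution. Each polynomial is stored as its coefficient triple for (x^2, x, 1), and the names `rank` and `is_linearly_independent` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, computed by forward elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_linearly_independent(vectors):
    """Independent exactly when no row is lost to elimination."""
    return rank(vectors) == len(vectors)


# Coefficients of (x^2, x, 1) for x^2 + 2x - 1 and 2x^2 - x + 3.
S = [(1, 2, -1), (2, -1, 3)]
# Adding the sum of the two polynomials, 3x^2 + x + 2, creates a dependent set.
S_with_sum = S + [(3, 1, 2)]

print(is_linearly_independent(S), is_linearly_independent(S_with_sum))
```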

If we know that one particular set is linearly independent, we can use this information 
to determine if a related set is linearly independent. Consider the following example. 


Example 9.22: Related Independent Sets 


Let V be a vector space and suppose $S \subseteq V$ is a set of linearly independent vectors 
given by S = {u, v, w}. Let $R \subseteq V$ be given by $R = \{2u - w, \; w + v, \; 3v + \frac{1}{2}u\}$. Show 
that R is also linearly independent. 


Solution. To determine if R is linearly independent, we write 

\[ a(2u - w) + b(w + v) + c(3v + \tfrac{1}{2}u) = \vec{0} \]

If the set is linearly independent, the only solution will be a = b = c = 0. We proceed as 
follows. 

\[
\begin{aligned}
a(2u - w) + b(w + v) + c(3v + \tfrac{1}{2}u) &= \vec{0} \\
2au - aw + bw + bv + 3cv + \tfrac{1}{2}cu &= \vec{0} \\
(2a + \tfrac{1}{2}c)u + (b + 3c)v + (-a + b)w &= \vec{0}
\end{aligned}
\]

We know that the set S = {u, v, w} is linearly independent, which implies that the 
coefficients in the last line of this equation must all equal 0. In other words: 

\[
\begin{aligned}
2a + \tfrac{1}{2}c &= 0 \\
b + 3c &= 0 \\
-a + b &= 0
\end{aligned}
\]

The augmented matrix and resulting reduced row-echelon form are given by: 

\[
\left[ \begin{array}{rrr|r}
2 & 0 & \frac{1}{2} & 0 \\
0 & 1 & 3 & 0 \\
-1 & 1 & 0 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array} \right]
\]

Hence the solution is a = b = c = 0 and the set is linearly independent. □ 
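
A quick cross-check on this conclusion: a square homogeneous system has only the trivial solution exactly when its coefficient matrix has nonzero determinant. The sketch below swaps in that determinant criterion in place of the row reduction used in the solution. The function name `det3` is hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Coefficient matrix of 2a + (1/2)c = 0, b + 3c = 0, -a + b = 0.
coefficient_matrix = [
    [Fraction(2), Fraction(0), Fraction(1, 2)],
    [Fraction(0), Fraction(1), Fraction(3)],
    [Fraction(-1), Fraction(1), Fraction(0)],
]

determinant = det3(coefficient_matrix)
print(determinant, determinant != 0)
```

The determinant works out to -11/2, which is nonzero, so a = b = c = 0 is the only solution, in agreement with the reduced row-echelon form.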

Consider the span of a linearly independent set of vectors. Suppose we take a vector which 
is not in this span and add it to the set. The following lemma claims that the resulting set 
is still linearly independent. 



Proof. Suppose $\sum_{i=1}^{k} c_i u_i + dv = \vec{0}$. It is required to verify that each $c_i = 0$ and 
that d = 0. But if $d \neq 0$, then you can solve for v as a linear combination of the vectors 
$\{u_1, \cdots, u_k\}$, 

\[ v = -\sum_{i=1}^{k} \left( \frac{c_i}{d} \right) u_i \]

contrary to the assumption that v is not in the span of the $u_i$. Therefore, d = 0. But then 
$\sum_{i=1}^{k} c_i u_i = \vec{0}$ and the linear independence of $\{u_1, \cdots, u_k\}$ implies each $c_i = 0$ also. □ 

Consider the following example. 
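
Example 9.24: Adding to a Linearly Independent Set 


Let $S \subseteq \mathbb{M}_{22}$ be a linearly independent set given by 

\[
S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \right\}
\]

Show that the set $R \subseteq \mathbb{M}_{22}$ given by 

\[
R = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \right\}
\]

is also linearly independent. 


Solution. Instead of writing a linear combination of the matrices which equals 0 and showing 
that the coefficients must equal 0, we can instead use Lemma 9.23. 

To do so, we show that 

\[
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \notin
\operatorname{span}\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \right\}
\]

Write 

\[
\begin{aligned}
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
 &= a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \\
 &= \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & b \\ 0 & 0 \end{bmatrix} \\
 &= \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}
\end{aligned}
\]

Clearly there are no possible a, b to make this equation true. Hence the new matrix does 
not lie in the span of the matrices in S. By Lemma 9.23, R is also linearly independent. □ 

Recall the definition of basis, considered now in the context of vector spaces. 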

Consider the following example. 

Example 9.26: Polynomials of Degree Two 


Let $\mathbb{P}_2$ be the set of polynomials of degree no more than 2. We can write 
$\mathbb{P}_2 = \operatorname{span}\{x^2, x, 1\}$. Is $\{x^2, x, 1\}$ a basis for $\mathbb{P}_2$? 


Solution. It can be verified that $\mathbb{P}_2$ is a vector space defined under the usual addition and 
scalar multiplication of polynomials. 

Now, since $\mathbb{P}_2 = \operatorname{span}\{x^2, x, 1\}$, the set $\{x^2, x, 1\}$ is a basis if it is linearly independent. 
Suppose then that 

\[ ax^2 + bx + c = 0x^2 + 0x + 0 \]

where a, b, c are real numbers. It is clear that this can only occur if a = b = c = 0. Hence 
the set is linearly independent and forms a basis of $\mathbb{P}_2$. □ 

The next theorem is an essential result in linear algebra and is called the exchange 
theorem. 



Proof. The proof will proceed as follows. First, we set up the necessary steps for the proof.
Next, we will assume that $r > s$ and show that this leads to a contradiction, thus requiring
that $r \leq s$.

Define $\mathrm{span}\{\vec{y}_1, \cdots, \vec{y}_s\} = V$. Since each $\vec{x}_i$ is in $\mathrm{span}\{\vec{y}_1, \cdots, \vec{y}_s\}$, it follows there exist
scalars $c_1, \cdots, c_s$ such that

$$\vec{x}_1 = \sum_{i=1}^{s} c_i \vec{y}_i \qquad (9.1)$$

Note that not all of these scalars $c_i$ can equal zero. Suppose that all the $c_i = 0$. Then it
would follow that $\vec{x}_1 = \vec{0}$ and so $\{\vec{x}_1, \cdots, \vec{x}_r\}$ would not be linearly independent. Indeed, if
$\vec{x}_1 = \vec{0}$, then $1\vec{x}_1 + \sum_{i=2}^{r} 0\vec{x}_i = \vec{x}_1 = \vec{0}$ and so there would exist a nontrivial linear combination of
the vectors $\{\vec{x}_1, \cdots, \vec{x}_r\}$ which equals zero. Therefore at least one $c_i$ is nonzero.

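Say $c_k \neq 0$. Then solve 9.1 for $\vec{y}_k$ and obtain

$$\vec{y}_k \in \mathrm{span}\{\vec{x}_1, \underbrace{\vec{y}_1, \cdots, \vec{y}_{k-1}, \vec{y}_{k+1}, \cdots, \vec{y}_s}_{s-1 \text{ vectors here}}\}$$

Define $\{\vec{z}_1, \cdots, \vec{z}_{s-1}\}$ to be

$$\{\vec{z}_1, \cdots, \vec{z}_{s-1}\} = \{\vec{y}_1, \cdots, \vec{y}_{k-1}, \vec{y}_{k+1}, \cdots, \vec{y}_s\}$$

Now we can write

$$\vec{y}_k \in \mathrm{span}\{\vec{x}_1, \vec{z}_1, \cdots, \vec{z}_{s-1}\}$$

Therefore, $\mathrm{span}\{\vec{x}_1, \vec{z}_1, \cdots, \vec{z}_{s-1}\} = V$. To see this, suppose $\vec{v} \in V$. Then there exist
constants $c_1, \cdots, c_s$ such that

$$\vec{v} = \sum_{i=1}^{s-1} c_i \vec{z}_i + c_s \vec{y}_k.$$

Replace this $\vec{y}_k$ with a linear combination of the vectors $\{\vec{x}_1, \vec{z}_1, \cdots, \vec{z}_{s-1}\}$ to obtain
$\vec{v} \in \mathrm{span}\{\vec{x}_1, \vec{z}_1, \cdots, \vec{z}_{s-1}\}$. The vector $\vec{y}_k$, in the list $\{\vec{y}_1, \cdots, \vec{y}_s\}$, has now been replaced with
the vector $\vec{x}_1$ and the resulting modified list of vectors has the same span as the original list
of vectors, $\{\vec{y}_1, \cdots, \vec{y}_s\}$.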

We are now ready to move on to the proof. Suppose that $r > s$ and that
$\mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_l, \vec{z}_1, \cdots, \vec{z}_p\} = V$, where the process established above has continued. In other words, the vectors $\vec{z}_1, \cdots, \vec{z}_p$
are each taken from the set $\{\vec{y}_1, \cdots, \vec{y}_s\}$ and $l + p = s$. This was done for $l = 1$ above. Then
since $r > s$, it follows that $l \leq s < r$ and so $l + 1 \leq r$. Therefore, $\vec{x}_{l+1}$ is a vector not in
the list $\{\vec{x}_1, \cdots, \vec{x}_l\}$, and since $\mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_l, \vec{z}_1, \cdots, \vec{z}_p\} = V$ there exist scalars $c_i$ and
$d_j$ such that

$$\vec{x}_{l+1} = \sum_{i=1}^{l} c_i \vec{x}_i + \sum_{j=1}^{p} d_j \vec{z}_j. \qquad (9.2)$$

Not all the $d_j$ can equal zero because if this were so, it would follow that $\{\vec{x}_1, \cdots, \vec{x}_r\}$ would
be a linearly dependent set because one of the vectors would equal a linear combination of
the others. Therefore, 9.2 can be solved for one of the $\vec{z}_i$, say $\vec{z}_k$, in terms of $\vec{x}_{l+1}$ and the
other $\vec{z}_i$ and, just as in the above argument, replace that $\vec{z}_k$ with $\vec{x}_{l+1}$ to obtain

$$\mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_l, \vec{x}_{l+1}, \underbrace{\vec{z}_1, \cdots, \vec{z}_{k-1}, \vec{z}_{k+1}, \cdots, \vec{z}_p}_{p-1 \text{ vectors here}}\} = V$$

Continue this way, eventually obtaining

$$\mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_s\} = V.$$

But then $\vec{x}_r \in \mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_s\}$ contrary to the assumption that $\{\vec{x}_1, \cdots, \vec{x}_r\}$ is linearly
independent. Therefore, $r \leq s$ as claimed. □
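
A single exchange step can be made concrete with a small instance. The vectors below are chosen here purely for illustration and do not come from the text: take $V = \mathbb{R}^2$ with spanning list $\{\vec{y}_1, \vec{y}_2\}$ and one independent vector $\vec{x}_1$.

```latex
\begin{align*}
\vec{y}_1 &= \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
\vec{y}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
\vec{x}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 1\,\vec{y}_1 + 1\,\vec{y}_2
  && \text{(this is 9.1 with } c_1 = c_2 = 1\text{)} \\
\vec{y}_1 &= \vec{x}_1 - \vec{y}_2 \in \mathrm{span}\{\vec{x}_1, \vec{y}_2\}
  && \text{(solve for } \vec{y}_1 \text{ since } c_1 \neq 0\text{)} \\
\mathrm{span}\{\vec{x}_1, \vec{y}_2\} &= \mathrm{span}\{\vec{y}_1, \vec{y}_2\} = V
  && \text{(} \vec{y}_1 \text{ has been exchanged for } \vec{x}_1\text{)}
\end{align*}
```

Here $\vec{z}_1 = \vec{y}_2$ plays the role of the remaining $s - 1 = 1$ vector from the original list.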


The following corollary follows from the exchange theorem. 


Corollary 9.28: Two Bases of the Same Length

Let $\vec{u}_1, \cdots, \vec{u}_m, \vec{v}_1, \cdots, \vec{v}_n \in V$. If $\{\vec{u}_1, \cdots, \vec{u}_m\}$ and $\{\vec{v}_1, \cdots, \vec{v}_n\}$ are two bases for $V$,
then $m = n$.

Proof. By Theorem 9.27, $m \leq n$ and $n \leq m$. Therefore $m = n$. □

This corollary is very important so we provide another proof independent of the exchange
theorem above.

Proof. Suppose $n > m$. Then since the vectors $\{\vec{u}_1, \cdots, \vec{u}_m\}$ span $V$, there exist scalars $c_{ij}$
such that

$$\sum_{i=1}^{m} c_{ij} \vec{u}_i = \vec{v}_j.$$

Therefore,

$$\sum_{j=1}^{n} d_j \vec{v}_j = \vec{0} \text{ if and only if } \sum_{j=1}^{n} \sum_{i=1}^{m} c_{ij} d_j \vec{u}_i = \vec{0}$$

if and only if

$$\sum_{i=1}^{m} \left( \sum_{j=1}^{n} c_{ij} d_j \right) \vec{u}_i = \vec{0}$$

Now since $\{\vec{u}_1, \cdots, \vec{u}_m\}$ is independent, this happens if and only if

$$\sum_{j=1}^{n} c_{ij} d_j = 0, \quad i = 1, 2, \cdots, m.$$

However, this is a system of $m$ equations in $n$ variables, $d_1, \cdots, d_n$, and $m < n$. Therefore,
there exists a solution to this system of equations in which not all the $d_j$ are equal to zero.
Recall why this is so. The augmented matrix for the system is of the form $[\, C \mid 0 \,]$ where
$C$ is a matrix which has more columns than rows. Therefore, there are free variables and
hence nonzero solutions to the system of equations. Such a solution gives
$\sum_{j=1}^{n} d_j \vec{v}_j = \vec{0}$ with not all $d_j$ equal to zero, which contradicts the linear
independence of $\{\vec{v}_1, \cdots, \vec{v}_n\}$. Similarly it cannot happen that $m > n$. □
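
The key step of this proof, that a homogeneous system with more variables than equations has a nonzero solution, can be sketched computationally. The function name `nontrivial_solution` and the sample matrix `C` below are illustrative assumptions, not part of the text; the sketch row reduces $C$ exactly with rational arithmetic and sets one free variable to 1.

```python
from fractions import Fraction

def nontrivial_solution(C):
    """Return a nonzero list d with C d = 0, for an m x n matrix C with m < n."""
    m, n = len(C), len(C[0])
    A = [[Fraction(x) for x in row] for row in C]
    pivots = []  # pivots[r] is the pivot column of row r
    row = 0
    for col in range(n):
        # find a row at or below `row` with a nonzero entry in this column
        p = next((r for r in range(row, m) if A[r][col] != 0), None)
        if p is None:
            continue
        A[row], A[p] = A[p], A[row]
        pivot = A[row][col]
        A[row] = [x / pivot for x in A[row]]
        # clear this column in every other row (reduced row-echelon form)
        for r in range(m):
            if r != row and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[row])]
        pivots.append(col)
        row += 1
        if row == m:
            break
    # since m < n, at least one column is not a pivot column
    free = next(c for c in range(n) if c not in pivots)
    d = [Fraction(0)] * n
    d[free] = Fraction(1)
    for r, c in enumerate(pivots):
        d[c] = -A[r][free]
    return d

C = [[1, 2, 3], [4, 5, 6]]  # 2 equations, 3 unknowns
d = nontrivial_solution(C)
print([str(x) for x in d])
```

For this `C` the free variable is the third one, and the returned vector satisfies both equations exactly.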


Given the result of the previous corollary, the following definition follows. 



Notice that the dimension is well defined by Corollary 9.28. It is assumed here that
$n < \infty$ and therefore such a vector space is said to be finite dimensional.


Example 9.30: Dimension of a Vector Space 


Let $P_2$ be the set of all polynomials of degree at most 2. Find the dimension of $P_2$.


Solution. If we can find a basis of $P_2$ then the number of vectors in the basis will give the
dimension. Recall from Example 9.26 that a basis of $P_2$ is given by

$$S = \{x^2, x, 1\}$$

There are three polynomials in $S$ and hence the dimension of $P_2$ is three. □

It is important to note that a basis for a vector space is not unique. A vector space can
have many bases. Consider the following example.

Example 9.31: A Different Basis for Polynomials of Degree Two

Let $P_2$ be the polynomials of degree no more than 2. Is $\{x^2 + x + 1, 2x + 1, 3x^2 + 1\}$
a basis for $P_2$?

Solution. Suppose these vectors are linearly independent but do not form a spanning set for
$P_2$. Then by Lemma 9.23, we could find a fourth polynomial in $P_2$ to create a new linearly
independent set containing four polynomials. However this would imply that we could find
a basis of $P_2$ of more than three polynomials. This contradicts the result of Example 9.30
in which we determined the dimension of $P_2$ is three. Therefore if these vectors are linearly
independent they must also form a spanning set and thus a basis for $P_2$.

Suppose then that

$$a(x^2 + x + 1) + b(2x + 1) + c(3x^2 + 1) = 0$$

$$(a + 3c)x^2 + (a + 2b)x + (a + b + c) = 0$$

We know that $\{1, x, x^2\}$ is linearly independent, and so it follows that

$$a + 3c = 0$$

$$a + 2b = 0$$

$$a + b + c = 0$$

and there is only one solution to this system of equations, $a = b = c = 0$. Therefore, these
are linearly independent and form a basis for $P_2$. □
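
The claim that the system $a + 3c = 0$, $a + 2b = 0$, $a + b + c = 0$ has only the zero solution can be checked by computing the determinant of its coefficient matrix: a nonzero determinant means the matrix is invertible, so the solution is unique. The helper name `det3` below is an assumption introduced for this sketch.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows are the equations; columns are the unknowns a, b, c.
M = [
    [1, 0, 3],  # a      + 3c = 0
    [1, 2, 0],  # a + 2b      = 0
    [1, 1, 1],  # a +  b +  c = 0
]
print(det3(M))  # -1
```

Since the determinant is $-1 \neq 0$, the only solution is $a = b = c = 0$, in agreement with the solution above.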

Consider the following theorem. 


Theorem 9.32: Every Subspace has a Basis 


Let $V$ be a nonzero subspace of a finite dimensional vector space $W$. Suppose $W$ has
dimension $n$. Then $V$ has a basis with no more than $n$ vectors.


Proof. Let $\vec{v}_1 \in V$ where $\vec{v}_1 \neq \vec{0}$. If $\mathrm{span}\{\vec{v}_1\} = V$, then it follows that $\{\vec{v}_1\}$ is a basis for
$V$. Otherwise, there exists $\vec{v}_2 \in V$ which is not in $\mathrm{span}\{\vec{v}_1\}$. By Lemma 9.23, $\{\vec{v}_1, \vec{v}_2\}$ is
a linearly independent set of vectors. If $\mathrm{span}\{\vec{v}_1, \vec{v}_2\} = V$, then $\{\vec{v}_1, \vec{v}_2\}$ is a basis for $V$ and we are done. If
$\mathrm{span}\{\vec{v}_1, \vec{v}_2\} \neq V$, then there exists $\vec{v}_3 \notin \mathrm{span}\{\vec{v}_1, \vec{v}_2\}$ and $\{\vec{v}_1, \vec{v}_2, \vec{v}_3\}$ is a larger linearly
independent set of vectors. Continuing this way, the process must stop before $n + 1$ steps
because if not, it would be possible to obtain $n + 1$ linearly independent vectors contrary to
the exchange theorem, Theorem 9.27. □

The following theorem claims that a spanning set of a vector space $V$ can be shrunk down
to a basis of $V$. Similarly, a linearly independent set within $V$ can be enlarged to create a
basis of $V$.

Theorem 9.33: Basis of V

If $V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_n\}$ is a vector space, then some subset of $\{\vec{u}_1, \cdots, \vec{u}_n\}$ is a basis
for $V$. Also, if $\{\vec{u}_1, \cdots, \vec{u}_k\} \subseteq V$ is linearly independent and the vector space is finite
dimensional, then the set $\{\vec{u}_1, \cdots, \vec{u}_k\}$ can be enlarged to obtain a basis of $V$.


Proof. Let

$$S = \{E \subseteq \{\vec{u}_1, \cdots, \vec{u}_n\} \text{ such that } \mathrm{span}\{E\} = V\}.$$

For $E \in S$, let $|E|$ denote the number of elements of $E$. Let

$$m = \min\{|E| \text{ such that } E \in S\}.$$

Thus there exist vectors

$$\{\vec{v}_1, \cdots, \vec{v}_m\} \subseteq \{\vec{u}_1, \cdots, \vec{u}_n\}$$

such that

$$\mathrm{span}\{\vec{v}_1, \cdots, \vec{v}_m\} = V$$

and $m$ is as small as possible for this to happen. If this set is linearly independent, it follows
it is a basis for $V$ and the theorem is proved. On the other hand, if the set is not linearly
independent, then there exist scalars $c_1, \cdots, c_m$ such that

$$\vec{0} = \sum_{i=1}^{m} c_i \vec{v}_i$$

and not all the $c_i$ are equal to zero. Suppose $c_k \neq 0$. Then solve for the vector $\vec{v}_k$ in terms
of the other vectors. Consequently,

$$V = \mathrm{span}\{\vec{v}_1, \cdots, \vec{v}_{k-1}, \vec{v}_{k+1}, \cdots, \vec{v}_m\}$$

contradicting the definition of $m$. This proves the first part of the theorem.

To obtain the second part, begin with $\{\vec{u}_1, \cdots, \vec{u}_k\}$ and suppose a basis for $V$ is

$$\{\vec{v}_1, \cdots, \vec{v}_n\}$$

If

$$\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\} = V,$$

then $k = n$. If not, there exists a vector

$$\vec{u}_{k+1} \notin \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$$

Then from Lemma 9.23, $\{\vec{u}_1, \cdots, \vec{u}_k, \vec{u}_{k+1}\}$ is also linearly independent. Continue adding
vectors in this way until $n$ linearly independent vectors have been obtained. Then

$$\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_n\} = V$$

because if it did not do so, there would exist $\vec{u}_{n+1}$ as just described and $\{\vec{u}_1, \cdots, \vec{u}_{n+1}\}$ would
be a linearly independent set of vectors having $n + 1$ elements. This contradicts the fact that
$\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis. In turn this would contradict Theorem 9.27. Therefore, this list is a
basis. □

Consider the following example. 



Solution. First we need to show that $S$ spans $P_2$. Let $ax^2 + bx + c$ be an arbitrary polynomial
in $P_2$. Write

$$ax^2 + bx + c = r(1) + s(x) + t(x^2) + u(x^2 + 1)$$

Then,

$$ax^2 + bx + c = r(1) + s(x) + t(x^2) + u(x^2 + 1) = (t + u)x^2 + s(x) + (r + u)$$

It follows that

$$a = t + u$$
$$b = s$$
$$c = r + u$$

Clearly a solution exists for all $a, b, c$ and so $S$ is a spanning set for $P_2$. By Theorem 9.33,
some subset of $S$ is a basis for $P_2$.

Recall that a basis must be both a spanning set and a linearly independent set. Therefore
we must remove a vector from $S$ keeping this in mind. Suppose we remove $x$ from $S$. The
resulting set would be $\{1, x^2, x^2 + 1\}$. This set is clearly linearly dependent (and also does
not span $P_2$) and so is not a basis.

Suppose we remove $x^2 + 1$ from $S$. The resulting set is $\{1, x, x^2\}$ which is both linearly
independent and spans $P_2$. Hence this is a basis for $P_2$. Note that removing any one of $1$, $x^2$, or $x^2 + 1$
will result in a basis. □
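
Shrinking a spanning set to a basis can also be sketched in code. The version below uses a simple greedy pass (keep a vector only if it is not in the span of the vectors already kept), which is a different route from the minimal-subset argument in the proof of Theorem 9.33. It assumes the set in this example is $S = \{1, x, x^2, x^2 + 1\}$, as the solution indicates, and represents each polynomial by its coordinates with respect to $(1, x, x^2)$. The names `rank` and `shrink_to_basis` are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of coordinate vectors, by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(A[0])):
        # look for a pivot in this column at or below row rk
        p = next((r for r in range(rk, len(A)) if A[r][col] != 0), None)
        if p is None:
            continue
        A[rk], A[p] = A[p], A[rk]
        for r in range(rk + 1, len(A)):
            f = A[r][col] / A[rk][col]
            A[r] = [a - f * b for a, b in zip(A[r], A[rk])]
        rk += 1
        if rk == len(A):
            break
    return rk

def shrink_to_basis(vectors):
    """Greedy pass: keep a vector only if it enlarges the span of those kept so far."""
    basis = []
    for v in vectors:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# S = {1, x, x^2, x^2 + 1}, written in coordinates (constant, x, x^2)
S = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1]]
basis = shrink_to_basis(S)
print(basis)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

The greedy pass drops $x^2 + 1$ because it already lies in the span of $1$ and $x^2$. Listing the vectors in a different order would keep a different, equally valid, basis.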

Recall Example 9.24 in which we added a matrix to a linearly independent set to create
a larger linearly independent set. By Theorem 9.33 we can extend a linearly independent
set to a basis.

Solution. Recall from the solution of Example 9.24 that the set $R \subseteq M_{22}$ given by

$$R = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \right\}$$

is also linearly independent. However this set is still not a basis for $M_{22}$ as it is not a spanning
set. In particular, $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$ is not in $\mathrm{span}\, R$. Therefore, this matrix can be added to the set
by Lemma 9.23 to obtain a new linearly independent set given by

$$T = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$

This set is linearly independent and now spans $M_{22}$. Hence $T$ is a basis. □
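
The two claims used in this solution can be written out explicitly. The scalars $a, b, c, d$ below are introduced here for the computation.

```latex
\begin{align*}
a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
+ b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
+ c \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
&= \begin{bmatrix} a & b \\ c & 0 \end{bmatrix}
\neq \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{for every choice of } a, b, c, \\
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
&= a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
+ b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
+ c \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
+ d \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\end{align*}
```

The first line shows that every element of the span of the first three matrices has lower right entry 0, so the fourth matrix is not in that span. The second line shows that the four matrices together span $M_{22}$.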


9.3.1. Exercises 


1. Determine if the following set is linearly independent. If it is linearly dependent, write 
one vector as a linear combination of the other vectors in the set. 

$$\{x + 1, \; x^2 + 2, \; x^2 - x - 3\}$$

2. Determine if the following set is linearly independent. If it is linearly dependent, write 
one vector as a linear combination of the other vectors in the set. 

$$\{x^2 + x, \; -2x^2 - 4x - 6, \; 2x - 2\}$$


3. Determine if the following set is linearly independent. If it is linearly dependent, write 
one vector as a linear combination of the other vectors in the set. 


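$$\left\{ \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} -7 & 2 \\ -2 & -3 \end{bmatrix}, \begin{bmatrix} 4 & 0 \\ 1 & 2 \end{bmatrix} \right\}$$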


4. Determine if the following set is linearly independent. If it is linearly dependent, write 
one vector as a linear combination of the other vectors in the set. 


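$$\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix} \right\}$$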


5. If you have 5 vectors in $\mathbb{R}^5$ and the vectors are linearly independent, can it always be
concluded they span $\mathbb{R}^5$?

6. If you have 6 vectors in $\mathbb{R}^5$, is it possible they are linearly independent? Explain.

7. Let $P_3$ be the polynomials of degree no more than 3. Determine which of the following
are bases for this vector space.

(a) $\{x + 1, \; x^3 + x^2 + 2x, \; x^2 + x, \; x^3 + x^2 + x\}$

(b) $\{x^3 + 1, \; x^2 + x, \; 2x^3 + x^2, \; 2x^3 - x^2 - 3x + 1\}$

8. In the context of the above problem, consider polynomials 

$$\{a_i x^3 + b_i x^2 + c_i x + d_i, \; i = 1, 2, 3, 4\}$$

Show that this collection of polynomials is linearly independent on an interval $[s, t]$ if
and only if

$$\begin{bmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{bmatrix}$$

is an invertible matrix.


9. Let the field of scalars be $\mathbb{Q}$, the rational numbers, and let the vectors be of the form
$a + b\sqrt{2}$ where $a, b$ are rational numbers. Show that this collection of vectors is a vector
space with field of scalars $\mathbb{Q}$ and give a basis for this vector space.

10. Suppose $V$ is a finite dimensional vector space. Based on the exchange theorem above,
it was shown that any two bases have the same number of vectors in them. Give
a different proof of this fact using the earlier material in the book. Hint: Suppose
$\{\vec{x}_1, \cdots, \vec{x}_n\}$ and $\{\vec{y}_1, \cdots, \vec{y}_m\}$ are two bases with $m < n$. Then define

$$\phi : \mathbb{R}^n \to V, \quad \psi : \mathbb{R}^m \to V$$

by

$$\phi(\vec{a}) = \sum_{k=1}^{n} a_k \vec{x}_k, \quad \psi(\vec{b}) = \sum_{j=1}^{m} b_j \vec{y}_j$$

Consider the linear transformation $\psi^{-1} \circ \phi$. Argue it is a one to one and onto mapping
from $\mathbb{R}^n$ to $\mathbb{R}^m$. Now consider a matrix of this linear transformation and its row reduced
echelon form.





A. Some Prerequisite Topics 


The topics presented in this section are important concepts in mathematics and therefore 
should be examined. 

A.1 Sets and Set Notation


A set is a collection of things called elements. For example $\{1, 2, 3, 8\}$ would be a set
consisting of the elements 1, 2, 3, and 8. To indicate that 3 is an element of $\{1, 2, 3, 8\}$, it is
customary to write $3 \in \{1, 2, 3, 8\}$. We can also indicate when an element is not in a set, by
writing $9 \notin \{1, 2, 3, 8\}$ which says that 9 is not an element of $\{1, 2, 3, 8\}$. Sometimes a rule
specifies a set. For example you could specify a set as all integers larger than 2. This would
be written as $S = \{x \in \mathbb{Z} : x > 2\}$. This notation says: $S$ is the set of all integers, $x$, such
that $x > 2$.

Suppose $A$ and $B$ are sets with the property that every element of $A$ is an element of $B$.
Then we say that $A$ is a subset of $B$. For example, $\{1, 2, 3, 8\}$ is a subset of $\{1, 2, 3, 4, 5, 8\}$.
In symbols, we write $\{1, 2, 3, 8\} \subseteq \{1, 2, 3, 4, 5, 8\}$. It is sometimes said that "$A$ is contained
in $B$" or even "$B$ contains $A$". The same statement about the two sets may also be written
as $\{1, 2, 3, 4, 5, 8\} \supseteq \{1, 2, 3, 8\}$.

We can also talk about the union of two sets, which we write as $A \cup B$. This is the
set consisting of everything which is an element of at least one of the sets, $A$ or $B$. As an
example of the union of two sets, consider $\{1, 2, 3, 8\} \cup \{3, 4, 7, 8\} = \{1, 2, 3, 4, 7, 8\}$. This set
is made up of the numbers which are in at least one of the two sets.

In general

$$A \cup B = \{x : x \in A \text{ or } x \in B\}$$

Notice that an element which is in both $A$ and $B$ is also in the union, as well as elements
which are in only one of $A$ or $B$.

Another important set is the intersection of two sets $A$ and $B$, written $A \cap B$. This set
consists of everything which is in both of the sets. Thus $\{1, 2, 3, 8\} \cap \{3, 4, 7, 8\} = \{3, 8\}$
because 3 and 8 are those elements the two sets have in common. In general,

$$A \cap B = \{x : x \in A \text{ and } x \in B\}$$

If $A$ and $B$ are two sets, $A \setminus B$ denotes the set of things which are in $A$ but not in $B$.
Thus

$$A \setminus B = \{x \in A : x \notin B\}$$

For example, if $A = \{1, 2, 3, 8\}$ and $B = \{3, 4, 7, 8\}$, then $A \setminus B = \{1, 2, 3, 8\} \setminus \{3, 4, 7, 8\} = \{1, 2\}$.

A special set which is very important in mathematics is the empty set denoted by $\emptyset$. The
empty set, $\emptyset$, is defined as the set which has no elements in it. It follows that the empty set
is a subset of every set. This is true because if it were not so, there would have to exist a
set $A$, such that $\emptyset$ has something in it which is not in $A$. However, $\emptyset$ has nothing in it and
so it must be that $\emptyset \subseteq A$.
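
The set operations described in this section map directly onto Python's built-in `set` type, which gives a quick way to check the worked examples with $\{1, 2, 3, 8\}$ and $\{3, 4, 7, 8\}$. The variable names below are illustrative.

```python
A = {1, 2, 3, 8}
B = {3, 4, 7, 8}

union = A | B                # elements in at least one of A, B
intersection = A & B         # elements in both A and B
difference = A - B           # elements of A that are not in B
is_subset = A <= {1, 2, 3, 4, 5, 8}   # subset test
empty_is_subset = set() <= A          # the empty set is a subset of every set

print(union, intersection, difference, is_subset, empty_is_subset)
```

Membership is tested with `in`, so `3 in A` is `True` and `9 in A` is `False`.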

We can also use brackets to denote sets which are intervals of numbers. Let $a$ and $b$ be
real numbers. Then

• $[a, b] = \{x \in \mathbb{R} : a \leq x \leq b\}$

• $[a, b) = \{x \in \mathbb{R} : a \leq x < b\}$

• $(a, b) = \{x \in \mathbb{R} : a < x < b\}$

• $(a, b] = \{x \in \mathbb{R} : a < x \leq b\}$

• $[a, \infty) = \{x \in \mathbb{R} : x \geq a\}$

• $(-\infty, a] = \{x \in \mathbb{R} : x \leq a\}$

These sorts of sets of real numbers are called intervals. The two points $a$ and $b$ are called
endpoints, or bounds, of the interval. In particular, $a$ is the lower bound while $b$ is the upper
bound of the above intervals, where applicable. Other intervals such as $(-\infty, b)$ are defined
by analogy to what was just explained. In general, the curved parenthesis, (, indicates the
end point is not included in the interval, while the square parenthesis, [, indicates this end
point is included. The reason that there will always be a curved parenthesis next to $\infty$ or
$-\infty$ is that these are not real numbers and cannot be included in the interval in the way a
real number can.

To illustrate the use of this notation relative to intervals consider three examples of 
inequalities. Their solutions will be written in the interval notation just described. 



Solution. We need to find $x$ such that $2x + 4 \leq x - 8$. Solving for $x$, we see that $x \leq -12$ is
the answer. This is written in terms of an interval as $(-\infty, -12]$. □

Consider the following example. 



Solution. We need to find $x$ such that $(x + 1)(2x - 3) \geq 0$. The solution is given by $x \leq -1$
or $x \geq \frac{3}{2}$. Therefore, any $x$ which fits into either of these intervals gives a solution. In terms of
set notation this is denoted by $(-\infty, -1] \cup [\frac{3}{2}, \infty)$. □

Consider one last example.



Solution. This inequality is true for any value of $x$ where $x$ is a real number. We can write
the solution as $\mathbb{R}$ or $(-\infty, \infty)$. □

In the next section, we examine another important mathematical concept. 

A.2 Well Ordering and Induction


We begin this section with some important notation. Summation notation, written $\sum_{i=1}^{j} i$,
represents a sum. Here, $i$ is called the index of the sum, and we add iterations until $i = j$.
For example,

$$\sum_{i=1}^{j} i = 1 + 2 + \cdots + j$$

Another example:

$$a_{11} + a_{12} + a_{13} = \sum_{i=1}^{3} a_{1i}$$

The following notation is a specific use of summation notation.



Notice that since addition is commutative, $\sum_{j=1}^{m} \sum_{i=1}^{n} a_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}$.
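
The fact that a finite double sum can be taken in either order is easy to check numerically. The array `a` below is a hypothetical example, and Python indices start at 0 instead of 1.

```python
# Hypothetical 2 x 3 array of numbers a[i][j]
a = [[1, 2, 3],
     [4, 5, 6]]
n, m = len(a), len(a[0])

# Sum over j inside, then over i (row by row)
rows_first = sum(sum(a[i][j] for j in range(m)) for i in range(n))
# Sum over i inside, then over j (column by column)
cols_first = sum(sum(a[i][j] for i in range(n)) for j in range(m))

print(rows_first, cols_first)  # 21 21
```

Both orders add up the same six entries, so the totals agree.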
We now consider the main concept of this section. Mathematical induction and well
ordering are two extremely important principles in math. They are often used to prove
significant things which would be hard to prove otherwise.

Definition A.5: Well Ordered

A set is well ordered if every nonempty subset $S$ contains a smallest element $z$ having
the property that $z \leq x$ for all $x \in S$.

In particular, the set of natural numbers defined as

$$\mathbb{N} = \{1, 2, \cdots\}$$

is well ordered.

Consider the following proposition. 



This proposition claims that if a set has a lower bound which is a real number, then this 
set is well ordered. 

Further, this proposition implies the principle of mathematical induction. The symbol Z 
denotes the set of all integers. Note that if a is an integer, then there are no integers between 
a and a + 1. 



Proof. Let $T$ consist of all integers larger than or equal to $a$ which are not in $S$. The theorem
will be proved if $T = \emptyset$. If $T \neq \emptyset$ then by the well ordering principle, there would have
to exist a smallest element of $T$, denoted as $b$. It must be the case that $b > a$ since by
definition, $a \notin T$. Thus $b \geq a + 1$, and so $b - 1 \geq a$ and $b - 1 \notin S$ because if $b - 1 \in S$, then
$b - 1 + 1 = b \in S$ by the assumed property of $S$. Therefore, $b - 1 \in T$ which contradicts the
choice of $b$ as the smallest element of $T$. ($b - 1$ is smaller.) Since a contradiction is obtained
by assuming $T \neq \emptyset$, it must be the case that $T = \emptyset$ and this says that every integer at least
as large as $a$ is also in $S$. □

Mathematical induction is a very useful device for proving theorems about the integers.
The procedure is as follows.

Procedure A.8: Proof by Mathematical Induction

Suppose $S_n$ is a statement which is a function of the number $n$, for $n = 1, 2, \cdots$, and
we wish to show that $S_n$ is true for all $n \geq 1$. To do so using mathematical induction,
use the following steps.

1. Base Case: Show $S_1$ is true.

2. Assume $S_n$ is true for some $n$, which is the induction hypothesis. Then, using
this assumption, show that $S_{n+1}$ is true.

Proving these two steps shows that $S_n$ is true for all $n = 1, 2, \cdots$.


We can use this procedure to solve the following examples. 



Solution. By Procedure A.8, we first need to show that this statement is true for $n = 1$.
When $n = 1$, the statement says that

$$\sum_{k=1}^{1} k^2 = \frac{1(1 + 1)(2(1) + 1)}{6} = \frac{6}{6} = 1$$

The sum on the left hand side also equals 1, so this equation is true for $n = 1$.

Now suppose this formula is valid for some $n \geq 1$ where $n$ is an integer. Hence, the
following equation is true.

$$\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6} \qquad (1.1)$$

We want to show that this is true for $n + 1$.

Suppose we add $(n + 1)^2$ to both sides of equation 1.1.

$$\sum_{k=1}^{n+1} k^2 = \sum_{k=1}^{n} k^2 + (n + 1)^2$$

$$= \frac{n(n + 1)(2n + 1)}{6} + (n + 1)^2$$

The step going from the first to the second line is based on the assumption that the formula
is true for $n$. Now simplify the expression in the second line,

$$\frac{n(n + 1)(2n + 1)}{6} + (n + 1)^2$$

This equals

$$(n + 1) \left( \frac{n(2n + 1)}{6} + (n + 1) \right)$$

and

$$\frac{n(2n + 1)}{6} + (n + 1) = \frac{6(n + 1) + 2n^2 + n}{6} = \frac{(n + 2)(2n + 3)}{6}$$

Therefore,

$$\sum_{k=1}^{n+1} k^2 = \frac{(n + 1)(n + 2)(2n + 3)}{6} = \frac{(n + 1)((n + 1) + 1)(2(n + 1) + 1)}{6}$$

showing the formula holds for $n + 1$ whenever it holds for $n$. This proves the formula by
mathematical induction. In other words, this formula is true for all $n = 1, 2, \cdots$. □
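
A numerical spot check is no substitute for the induction proof, but it is a useful sanity check on the closed form $\frac{n(n+1)(2n+1)}{6}$. The function names and the tested range of 200 values below are arbitrary choices made for this sketch.

```python
def sum_of_squares(n):
    """Left hand side: 1^2 + 2^2 + ... + n^2, added term by term."""
    return sum(k * k for k in range(1, n + 1))

def closed_form(n):
    """Right hand side: n(n + 1)(2n + 1) / 6 (always an integer)."""
    return n * (n + 1) * (2 * n + 1) // 6

# Compare both sides for n = 1, ..., 200
matches = all(sum_of_squares(n) == closed_form(n) for n in range(1, 201))
print(matches)  # True
```

The induction step can be spot checked the same way: `closed_form(n) + (n + 1) ** 2 == closed_form(n + 1)` holds for each tested `n`.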


Consider another example. 



Solution. Again we will use the procedure given in Procedure A.8 to prove that this statement
is true for all $n$. Suppose $n = 1$. Then the statement says

$$\frac{1}{2} < \frac{1}{\sqrt{3}}$$

which is true.

Suppose then that the inequality holds for $n$. In other words,

$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n - 1}{2n} < \frac{1}{\sqrt{2n + 1}}$$

is true.

Now multiply both sides of this inequality by $\frac{2n + 1}{2n + 2}$. This yields

$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n - 1}{2n} \cdot \frac{2n + 1}{2n + 2} < \frac{1}{\sqrt{2n + 1}} \cdot \frac{2n + 1}{2n + 2} = \frac{\sqrt{2n + 1}}{2n + 2}$$

The theorem will be proved if this last expression is less than $\frac{1}{\sqrt{2n + 3}}$. This happens if and
only if

$$\left( \frac{1}{\sqrt{2n + 3}} \right)^2 = \frac{1}{2n + 3} > \frac{2n + 1}{(2n + 2)^2}$$

which occurs if and only if $(2n + 2)^2 > (2n + 3)(2n + 1)$ and this is clearly true which may
be seen from expanding both sides. This proves the inequality. □

Let's review the process just used. If $S$ is the set of integers at least as large as 1 for which
the formula holds, the first step was to show $1 \in S$ and then that whenever $n \in S$, it follows
$n + 1 \in S$. Therefore, by the principle of mathematical induction, $S$ contains $[1, \infty) \cap \mathbb{Z}$,
all positive integers. In doing an inductive proof of this sort, the set $S$ is normally not
mentioned. One just verifies the steps above.
